Hacker News

pseudosavant · last Thursday at 7:18 PM

I don't know how many others here have a Copilot+ PC, but the NPU on it is basically useless. There isn't a single meaningful feature I get from having it. These NPUs are far too limited to ever do meaningful local LLM inference, image processing, or generation. They handle stuff like video-chat background blurring, but users' PCs have been doing that for years without an NPU.


Replies

kenjackson · last Thursday at 7:34 PM

I'd love to see a thorough breakdown of what these local NPUs can really do. I've had friends ask me about this (as the resident computer expert) and I really have no idea. Everything I see advertised (blurring, speech-to-text, etc.) is something my non-NPU machine never struggled with. Is there a single remotely killer application for local client NPUs?

skrebbel · last Thursday at 7:40 PM

I have one as well and I simply don’t get it. I lucked into somewhat acceptable local LLM’ing by virtue of the Intel integrated “GPU” sharing system RAM as VRAM, which I’m pretty sure wasn’t meant to be the awesome feature it turned out to be. Sure, it’s dead slow, but I can run mid-size models, and that’s pretty cool for an office-marketed HP convertible.

(it’s still amazing to me that I can download a 15GB blob of bytes and then that blob of bytes can be made to answer questions and write prose)

But the NPU, the thing actually marketed for doing local AI just sits there doing nothing.
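
For the curious, local inference on these machines can be only a few lines. A minimal sketch with llama-cpp-python, assuming a llama.cpp build with GPU offload (SYCL or Vulkan for Intel iGPUs) and an already-downloaded GGUF file; the model path and prompt are placeholders:

    # Minimal local-inference sketch; the model file is hypothetical.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",
        n_gpu_layers=-1,  # offload every layer into the iGPU's shared memory
        n_ctx=4096,       # context window; bigger eats more of that shared RAM
    )

    out = llm("Explain what an NPU is in one paragraph.", max_tokens=200)
    print(out["choices"][0]["text"])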

SomeHacker44 · yesterday at 9:22 AM

Also, the Copilot button/key is useless. It cannot be remapped to anything in Ubuntu because it sends a sequence of multiple keycodes instead of a single keycode for key-down and key-up. You cannot remap it to a useful modifier or anything! What a waste of keyboard real estate.
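
For reference, the key is widely reported to arrive as a Meta+Shift+F23 burst. A rough workaround sketch with python-evdev that watches for the F23 part and injects a single, remappable keycode instead; the device path is a placeholder, and this needs read access to /dev/input:

    from evdev import InputDevice, UInput, ecodes as e

    dev = InputDevice("/dev/input/event3")  # placeholder; find yours with evtest
    ui = UInput()                           # virtual keyboard for injected events

    for event in dev.read_loop():
        # The Copilot key reportedly sends Meta+Shift+F23 as one burst; F23
        # alone is a usable trigger since nothing else on the board emits it.
        if event.type == e.EV_KEY and event.code == e.KEY_F23 and event.value == 1:
            ui.write(e.EV_KEY, e.KEY_CALC, 1)  # inject a key that is remappable
            ui.write(e.EV_KEY, e.KEY_CALC, 0)
            ui.syn()

The stray Meta+Shift presses still leak through; fully swallowing them means grabbing the device and replaying everything else, which is more code.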

janalsncm · last Thursday at 8:53 PM

If I had to steelman Dell, they probably made a bet a while ago that the software side would have something for the NPU, and if so they wanted to have a device to cash in on it. The turnaround time for new hardware was probably on the order of years (I could be wrong about this).

It turned out to be an incorrect gamble but maybe it wasn’t a crazy one to make at the time.

There is also a chicken and egg problem of software being dependent on hardware, and hardware only being useful if there is software to take advantage of its features.

That said I haven’t used Windows in 10 years so I don’t have a horse in this race.

dworks · yesterday at 1:23 AM

What we want as developers: to be able to implement functionality that uses a model for tasks like OCR, visual input and analysis, search, re-ranking, etc., without having to integrate a hosted LLM API and pay for it. Instead, we'd like to offer the functionality to users, possibly at no cost, and use their edge computing capacity by calling local protocols and models.

What we want as users: to have advanced functionality without paying for a model or API, and without authenticating it with every app we use. We also want to keep our data on our devices.

What trainers of small models want: a way to get their models onto users' devices, and potentially to be paid for advanced, specialized, highly performant on-device models instead of APIs.
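
The plumbing for that already has a rough shape. A hedged sketch with onnxruntime, where the app asks for an NPU execution provider and falls back to the CPU; the model file and input shape are hypothetical:

    import numpy as np
    import onnxruntime as ort

    # Prefer the NPU if this build exposes one, otherwise fall back to CPU.
    # QNNExecutionProvider targets Qualcomm NPUs and only exists in NPU builds.
    wanted = ["QNNExecutionProvider", "CPUExecutionProvider"]
    providers = [p for p in wanted if p in ort.get_available_providers()]

    session = ort.InferenceSession("reranker.onnx", providers=providers)

    name = session.get_inputs()[0].name
    scores = session.run(None, {name: np.random.rand(1, 128).astype(np.float32)})
    print("running on:", session.get_providers()[0])

The catch, as other comments here note, is that the NPU-provider half of this is exactly the part that's still missing or experimental on most of these machines.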

GrantMoyer · last Thursday at 7:53 PM

The idea is that NPUs are more power-efficient for convolutional neural network operations. I don't know whether they actually are more power efficient, but it'd be wrong to dismiss them just because they don't unlock new capabilities or perform well on very large models. For smaller ML applications like blurring backgrounds, object detection, or OCR, they could be a real benefit to battery life.
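
The claim is at least checkable. A rough sketch that times the same model per provider via onnxruntime; latency is only a proxy, since the efficiency argument is really about joules (which needs a power meter), and the model name and input shape here are made up:

    import time
    import numpy as np
    import onnxruntime as ort

    x = np.random.rand(1, 3, 224, 224).astype(np.float32)  # one video frame

    for provider in ("CPUExecutionProvider", "QNNExecutionProvider"):
        if provider not in ort.get_available_providers():
            continue  # skip providers this build doesn't ship
        sess = ort.InferenceSession("blur_segmenter.onnx", providers=[provider])
        feed = {sess.get_inputs()[0].name: x}
        sess.run(None, feed)                  # warm-up run
        t0 = time.perf_counter()
        for _ in range(100):
            sess.run(None, feed)
        ms = (time.perf_counter() - t0) * 10  # 100 runs -> ms per run
        print(f"{provider}: {ms:.1f} ms/frame")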

pseudosavant · yesterday at 5:51 PM

I did some research into what you could get if the NPU's transistor budget were spent on something else in the SoC/CPU.

You could have 4-10 additional CPU cores, or 30-100 MB more L3 cache. I would definitely rather have more cores or cache than a slightly more efficient background-blurring engine.
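
A back-of-envelope version of that trade, with loudly assumed numbers (ballpark die areas for a recent-node SoC, not measurements of any real chip):

    # All three constants are assumptions, not die measurements.
    npu_area_mm2 = 12.0    # assumed area of a ~45 TOPS NPU block
    cpu_core_mm2 = 2.5     # assumed area of one efficiency-class core + L2
    l3_mm2_per_mb = 0.35   # assumed L3 SRAM density including overhead

    print("extra CPU cores:", int(npu_area_mm2 / cpu_core_mm2))   # ~4-5
    print("extra L3 (MB):  ", int(npu_area_mm2 / l3_mm2_per_mb))  # ~34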

zozbot234 · last Thursday at 7:58 PM

NPUs overall need better support from local AI frameworks. They're not "useless" for what they can do (low-precision bulk compute, which is potentially relevant to many of the newer models), and they could help with thermal limits thanks to their higher power efficiency compared to the CPU/iGPU. But all of that requires specialized support that hasn't materialized.
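
Concretely, "low-precision bulk compute" means a float32 model usually has to be quantized before an NPU path will touch it. A minimal sketch with onnxruntime's quantization tooling; the file names are placeholders:

    # Shrink fp32 weights to int8, the bulk-compute format NPUs want.
    from onnxruntime.quantization import QuantType, quantize_dynamic

    quantize_dynamic(
        model_input="model_fp32.onnx",
        model_output="model_int8.onnx",
        weight_type=QuantType.QInt8,
    )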

hacker_homie · last Thursday at 9:16 PM

Yeah, that's because the original NPUs were a rush job; the AMD AI Max is the only one that's worth anything, in my opinion.

simulator5g · yesterday at 1:36 AM

If you do use video-chat background blurring, the NPU is more efficient at it than your CPU or GPU. So the features it buys you are longer battery life, less resource usage on your main chips, and better performance for the things NPUs can do, e.g. higher video quality on your blurred background.

heavyset_go · last Thursday at 8:44 PM

The stacks for consumer NPUs are absolutely cursed; this does not surprise me.

They (Dell) promised a lot in their marketing, but we're several years into the whole Copilot+ PC thing and you still can barely, if at all, use sane stacks with laptop NPUs.

generalizations · last Thursday at 11:04 PM

NPUs were pushed by Microsoft, who saw the writing on the wall: AI like ChatGPT will dominate the user's experience, edge computing is a huge advantage in that regard, and Apple's hardware can already do it. NPUs are basically Microsoft trying to fudge their way to a llama.cpp-on-Apple-Silicon experience. Obviously it failed, but they couldn't not try.

shrubble · yesterday at 2:49 AM

The NPU is essentially the Sony Cell "SPE" coprocessor writ large.

The Cell SPE was extremely fast but had a weird memory architecture and a small amount of local memory, just like the NPU, which makes it more difficult for application programmers to work with.
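
A toy illustration of that pattern: the working set has to be tiled to fit the small local store, with explicit copy-in/copy-out where a CPU would give you a transparent cache. Sizes are made up, and numpy stands in for real DMA:

    import numpy as np

    LOCAL_BYTES = 256 * 1024                         # a Cell-SPE-sized local store
    a = np.random.rand(4096, 128).astype(np.float32)
    b = np.random.rand(128, 128).astype(np.float32)  # 64 KB: stays resident
    out = np.empty_like(a)

    TILE = 128  # 128x128 fp32 in-tile + out-tile = 128 KB; plus b = 192 KB < 256 KB
    for i in range(0, a.shape[0], TILE):
        tile = a[i:i + TILE].copy()  # "DMA in": stage a slice in local memory
        out[i:i + TILE] = tile @ b   # compute out of the local store, then
                                     # "DMA out" the result back to main RAM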

withinrafael · yesterday at 4:27 AM

The Copilot Runtime APIs to utilize the NPU are still experimental and mostly unavailable. I can't believe an entire generation of the Snapdragon X chip came and went without working APIs. Truly incredible.

greenchair · last Thursday at 10:07 PM

I've got one anecdote: a friend needed Live Captions for a translation job and had to get a Copilot+ PC just for that.

krooj · last Thursday at 9:51 PM

Question: from the perspective of the actual silicon, are these NPUs just another form of SIMD? If so, that's a laughable sleight of hand, and the circuits will be relegated to some mothball footnote in the same manner as AVX-512, etc.

To be fair, SIMD made a massive difference for early multimedia PCs for things like music playback, gaming, and composited UIs.
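
For intuition, a conceptual numpy sketch of the two dataflows the question contrasts: SIMD as wide 1-D lanes doing one vector op per step, versus an NPU-style MAC array as a 2-D grid of accumulators absorbing rank-1 updates. Purely illustrative; neither loop is how the silicon actually schedules it:

    import numpy as np

    a = np.random.rand(64, 64).astype(np.float32)
    b = np.random.rand(64, 64).astype(np.float32)

    # SIMD-flavoured schedule: one row at a time, a vector op per step.
    simd_out = np.array([row @ b for row in a])

    # Systolic-flavoured schedule: a 64x64 accumulator grid absorbing one
    # outer product per streamed column/row pair.
    acc = np.zeros((64, 64), dtype=np.float32)
    for k in range(64):
        acc += np.outer(a[:, k], b[k, :])

    assert np.allclose(simd_out, acc, atol=1e-3)  # same math, different movement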
