The stupid difference is supposed to be that they have some tensor math accelerators like the ones that have been on GPUs for three generations now. Except they’re small and slow and can barely run anything locally, so if you care about “AI” you’re probably using a dedicated GPU instead of an “NPU”.
And because local AI features have been largely useless, so far there is no software that will, say, take advantage of NPU processing for stuff like image upscaling while using the GPU tensor calculations for in-game raytracing or whatever. You’re not even offloading any workload to the NPU when you’re using your GPU, regardless of what you’re using it for.
For Apple stuff where it’s all integrated it’s probably closer to what you describe, just using the integrated GPU acceleration. I think there are some specific optimizations for the kind of tensor math used in AI as opposed to graphics, but it’s mostly the same thing.
The idea is having tensor acceleration built into SoCs for portable devices so they can run models locally on laptops, tablets and phones.
Because, you know, server-side ML model calculations are expensive, so offloading compute to the client makes them cheaper.
But this gen can’t really run anything useful locally so far, as far as I can tell. Most of the demos during the ramp-up to these were thoroughly underwhelming and nowhere near what you get from server-side services.
Of course they could have just called the “NPU” a new GPU feature and made it work closer to how this is run on dedicated GPUs, but I suppose somebody thought that branding this as a separate device was more marketable.
Seems silly to try to get the CPU to do GPU stuff, just upgrade the GPU.