That doesn't mean anything; it's just a name change. They're the same kind of unit.
And whatever accelerator you put into it, you're not running Gemini 3 or GPT-5.1 on your laptop, not in any reasonable time frame.
Also, it does mean something. An NPU is completely different from your 5070. Yes, the 5070 has dedicated AI cores, but it also has raster cores and other hardware not present in an NPU.
You don't need to run GPT-5.1 to summarize a webpage. Models can be small and specialized for different tasks.
No, NPUs are designed to be power efficient in ways GPU compute isn't.
You also don't need Gemini 3 or GPT-anything running locally.
Over the last few decades I've seen people make the same comment about spell checking, voice recognition, video encoding, 3D rendering, audio effects, and many more.
I'm happy to say that LLM usage will only become properly integrated into background workflows once we have performant local models.
People are madly trying to monetise cloud LLMs before the inevitable rise of local-only LLMs severely diminishes the market.