> the reality is that AI would change everything we do
Your true-believer convictions don't matter here. Those AI accelerators are just marketing stunts. They won't help your local inference: they aren't general-purpose enough for that, and they're too weak to be impactful. Most people will never run local inference because it sucks and is a resource hog most can't afford, and it goes against the interests of the scammy, unprofitable corporations selling us LLMs as the silver bullet to every problem, the same ones who got us here in the first place (and they've already succeeded in making computing unaffordable). There's little to no economic or functional point to those NPUs.
> most people won't ever run local inference because it sucks and is a resource hog most can't afford
a) Local inference for chats sucks. Using LLMs for chatting is stupid though.
b) Local inference is cheap if you're not selling a general-purpose chatbot.
There's a lot of fun stuff you can do with a local LLM that previously wasn't economically feasible.
Two big ones are gaming (for example, text adventure games or complex board games like Magic: The Gathering) and office automation (word processors, Excel spreadsheets).
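To give a rough sense of how cheap this is, here's a minimal sketch of the gaming case, assuming the llama-cpp-python bindings and a small quantized GGUF model (the file path and settings are placeholders, not a specific recommendation). A few-billion-parameter model quantized to 4 bits fits in a few GB of RAM and handles a short, constrained prompt like a single game turn on a plain CPU:

```python
# Illustrative sketch: a local LLM as a text-adventure narrator.
# Assumes the llama-cpp-python bindings and a small quantized GGUF model
# downloaded to ./models/ -- the path and settings below are placeholders.
from llama_cpp import Llama

llm = Llama(
    model_path="./models/small-model-q4_k_m.gguf",  # roughly 2-4 GB on disk
    n_ctx=2048,   # a short context is plenty for a single game turn
    n_threads=4,  # runs on an ordinary laptop CPU, no NPU/GPU required
)

prompt = (
    "You are the narrator of a text adventure set in an abandoned lighthouse.\n"
    "Player action: 'I climb the spiral staircase.'\n"
    "Describe what the player sees in two sentences:\n"
)

out = llm(prompt, max_tokens=96, stop=["\n\n"])
print(out["choices"][0]["text"].strip())
```

The office-automation case looks the same: short, constrained prompts against a small model, which is exactly the workload that doesn't need a datacenter.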
> most people won't ever run local inference because it sucks and is a resource hog most can't afford
You have fallen headfirst into the "not now, so never" fallacy, as if consumer hardware won't get more powerful, or models more economical to run.