This isn't a hardware feat, this is a software triumph.
They didn't make special purpose hardware to run a model. They crafted a large model so that it could run on consumer hardware (a phone).
The iPhone 17 Pro launched 8 months ago with 50% more RAM and about double the inference performance of the previous iPhone Pro (also 10x prompt processing speed).
>triumph
It’s been a lot of years, but all I can hear after reading that is … I’m making a note here, huge success
both, tbh
It's both.
We haven't had phones running laptop-grade CPUs/GPUs for that long, and that is a very real hardware feat. Likewise, nobody would've said running a 400b LLM on a low-end laptop was feasible, and that is very much a software triumph.