Hacker News

simopa, today at 2:57 PM

It's crazy to see a 400B model running on an iPhone. But moving forward, as the information density and architectural efficiency of smaller models continue to increase, getting high-quality, real-time inference on mobile is going to become trivial.


Replies

anemll, today at 5:49 PM

Probably 2x the speed on a Mac Studio this year if they double the NAND (or quadruple it?)

volemo, today at 4:23 PM

> moving forward, as the information density and architectural efficiency of smaller models continue to increase

If they continue to increase.
