
toufka · yesterday at 8:23 PM

I mean, the most cutting-edge iPhones, iPads and MacBook Pros _today_ are quite capable of running today's high-end local LLMs in realtime.
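For concreteness, here's a minimal sketch of what "running locally" means in practice, using the llama-cpp-python bindings; the model path and settings are illustrative placeholders, not a claim about any specific model:

    # Minimal local inference sketch (pip install llama-cpp-python).
    # Model path and parameters below are placeholders.
    from llama_cpp import Llama

    llm = Llama(
        model_path="./some-quantized-model.gguf",  # any local GGUF checkpoint
        n_ctx=2048,        # context window
        n_gpu_layers=-1,   # offload all layers to the GPU (Metal on Apple silicon)
    )

    out = llm("Q: Why run LLMs locally? A:", max_tokens=64)
    print(out["choices"][0]["text"])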

If you project that hardware out just a couple of years, and the trained models out a couple of years, you end up in a place where it makes much more sense to run them locally, for all sorts of latency, privacy, efficacy, and domain-specific reasons.

Not all that different from the old terminal-and-mainframe -> PC shift.

Finally - hardware has seemingly gotten out ahead of the software most folks use - watching YouTube, listening to music, playing a game or two. There was a time when playing an MP3 or watching a 4K video really taxed all but the nicest systems. Hardware fixed that problem, and it could very well fix this one too.


Replies

sofixa · yesterday at 8:56 PM

> I mean, the most cutting-edge iPhones, iPads and MacBook Pros _today_ are quite capable of running today's high-end local LLMs in realtime

Definitely not the high-end local LLMs. The small ones, yes, absolutely.

> If you project out that hardware just a couple of years

One of the biggest bottlenecks for LLMs is memory capacity and bandwidth. With the current memory crunch, it's unlikely we'll see big advances in the average memory capacity or bandwidth of regular (not super high-end) devices in the coming years.
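To put rough numbers on the bandwidth point: at batch size 1, every generated token has to read essentially all of the model's weights from memory, so decode speed is roughly bandwidth divided by model size. A back-of-envelope sketch in Python (all figures are illustrative assumptions, not measurements):

    # Rough upper bound on batch-1 decode speed: memory bandwidth / weight bytes.
    # All numbers below are illustrative assumptions.

    def rough_tokens_per_sec(params_b: float, bits_per_weight: float,
                             bandwidth_gb_s: float) -> float:
        model_gb = params_b * bits_per_weight / 8   # weight size in GB
        return bandwidth_gb_s / model_gb            # tokens/sec, best case

    print(rough_tokens_per_sec(70, 4, 400))  # ~11 tok/s: 70B @ 4-bit on a ~400 GB/s laptop SoC
    print(rough_tokens_per_sec(70, 4, 60))   # ~1.7 tok/s: same model at phone-class bandwidth
    print(rough_tokens_per_sec(3, 4, 60))    # ~40 tok/s: a 3B small model on that same phone

Which is exactly why the small ones run fine on today's devices and the big ones don't.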

Alternatively, it's possible we get dedicated SLMs (small language models) for, e.g., phone-specific use cases that are optimised and run well.