
zozbot234, yesterday at 2:39 PM

And then only Apple devices have 512 GB of unified memory, which matters when you have to combine larger models (even MoE) with the bigger context and KV cache you need for agentic workflows. You can make do with less, but only by slowing things down a whole lot.
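To put rough numbers on the memory point, here is a back-of-the-envelope sketch. It assumes a standard transformer with grouped-query attention, where the KV cache stores one key and one value vector per layer, per KV head, per token. Every model figure below (layer count, head count, parameter count, quantization) is a made-up placeholder, not the spec of any real model, and the helper names are hypothetical.

```python
GIB = 1024 ** 3


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len,
                   bytes_per_elem=2, batch=1):
    """Approximate KV cache size for plain (grouped-query) attention.

    The leading factor of 2 covers keys plus values. bytes_per_elem=2
    assumes an fp16/bf16 cache; a quantized cache would be smaller.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem * batch


def weight_bytes(n_params, bits_per_weight):
    """Approximate weight footprint at a given quantization width.

    For an MoE model, n_params should be the TOTAL parameter count:
    all experts have to be resident even if only a few run per token.
    """
    return n_params * bits_per_weight // 8


# Hypothetical configuration, chosen only for illustration.
kv = kv_cache_bytes(n_layers=60, n_kv_heads=8, head_dim=128, seq_len=131072)
weights = weight_bytes(n_params=400_000_000_000, bits_per_weight=4)

print(f"KV cache: {kv / GIB:.1f} GiB")
print(f"Weights:  {weights / GIB:.1f} GiB")
print(f"Total:    {(kv + weights) / GIB:.1f} GiB")
```

With these assumed numbers, a single 128K-token context costs 30 GiB of KV cache on top of the weights, and that cost scales linearly with context length and with the number of concurrent sessions. That is the part that grows quickly in agentic use.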