logoalt Hacker News

jillesvangurptoday at 10:20 AM1 replyview on HN

You can always delegate sub agents to cloud based infrastructure for things that need more intelligence. But the future indeed is to keep the core interaction loop on the local device always ready for your input.

A lot of stuff that we ask of these models isn't all that hard. Summarize this, parse that, call this tool, look that up, etc. 99.999% really isn't about implementing complex algorithms, solving important math problems, working your way through a benchmark of leet programming exercises, etc. You also really don't need these models to know everything. It's nice if it can hallucinate a decent answer to most questions. But the smarter way is to look up the right answer and then summarize it. Good enough goes a long way. Speed and latency are becoming a key selling point. You need enough capability locally to know when to escalate to something slower and more costly.

This will drive an overdue increase in memory size of phones and laptops. Laptops especially have been stuck at the same common base level of 8-16GB for about 15 years now. Apple still sells laptops with just 8GB (their new Neo). I had a 16 GB mac book pro in 2012. At the time that wasn't even that special. My current one has 48GB; enough for some of the nicer models. You can get as much as 256GB today.


Replies

zozbot234today at 10:26 AM

> This will drive an overdue increase in memory size of phones and laptops.

DRAM costs are still skyrocketing, so no, I don't think so. It's more likely that we'll bring back wear-resistant persistent memory as formerly seen with Intel Optane.