Hacker News

pzo · last Thursday at 12:02 PM · 1 reply

I'm not sure running any local model on smartphones makes sense right now for most people. I've played with plenty of tiny LLMs in Ollama, and only the 14B models start to feel usable; otherwise, why not just use an app that does inference in the cloud with SOTA models? Even o4-mini feels more usable than those tiny models on smartphones. Apple would have to ship iPhones with 24GB of RAM, and inference would still be slower than in the cloud. Paying a $200 yearly subscription for ChatGPT/Claude/Perplexity/Gemini still seems cheaper than buying a new iPhone just for that.
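As a back-of-envelope sanity check on the RAM claim, here's a rough sketch; the bytes-per-parameter values are typical for common quantization formats, and the overhead factor is a loose assumption, not a benchmark:

```python
# Rough RAM estimate for running a quantized LLM locally.
# Bytes-per-parameter are approximate for common quantization formats;
# the KV-cache/runtime overhead factor (1.2x) is a loose assumption.
BYTES_PER_PARAM = {"fp16": 2.0, "q8": 1.0, "q4": 0.5}

def estimated_ram_gb(params_billions: float, quant: str, overhead: float = 1.2) -> float:
    weights_gb = params_billions * BYTES_PER_PARAM[quant]  # 1B params * 1 byte ~= 1 GB
    return weights_gb * overhead

for quant in ("fp16", "q8", "q4"):
    print(f"14B @ {quant}: ~{estimated_ram_gb(14, quant):.1f} GB")
# Even at 4-bit, a 14B model wants ~8-9 GB on top of the OS and apps,
# which is why current phone RAM doesn't cut it.
```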

I do wish, though, that they'd ship some beefy Apple TV / Mac mini / router with 32GB of RAM that could work not only as a private LLM server but also as a private iCloud, VPN, router, Pi-hole, etc.


Replies

kalleboo · yesterday at 12:58 AM

> I'm not sure running any local model on smartphones makes sense right now for most people

It probably makes sense for simpler tasks like summarizing text messages/emails that you might not necessarily want to send off to a third party with a "move fast and break things" approach to data privacy.
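A rough sketch of what that could look like with a self-hosted setup today (this assumes a local Ollama instance on its default port with a small model such as llama3.2 pulled; it's an illustration, not anything Apple ships):

```python
# Minimal sketch: summarize a message locally via Ollama's HTTP API,
# so the text never leaves the device/LAN. Assumes Ollama is running on
# its default port (11434) and a small model like "llama3.2" is available.
import requests

def summarize_locally(text: str, model: str = "llama3.2") -> str:
    resp = requests.post(
        "http://localhost:11434/api/generate",
        json={
            "model": model,
            "prompt": f"Summarize this message in one sentence:\n\n{text}",
            "stream": False,  # return the whole response as one JSON object
        },
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["response"]

print(summarize_locally("Hey, dinner moved to 7pm at Sara's place, bring the salad."))
```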