an0malous • yesterday at 10:41 PM • 4 replies

This is why the AI companies are rushing to IPO. By the end of next year you’ll be running most of your AI on-device. They have no moat, they’ve reached the limits of scaling, most of the magic can be distilled into smaller models, and they know it.

Replies

hadlock • yesterday at 11:15 PM

Qwen's ~30B-class models are genuinely good enough for use if you can find a machine with enough memory bandwidth to run them at 30-90 tokens/second. It's been extremely telling that Qwen stopped releasing 120B-class models. At some point in the next 10 years (maybe 3?), someone is going to release an Opus 4.5-class 256B model you can run locally. Right now our engineers use about $800/mo worth of Opus tokens; at that rate the ROI for a local LLM is ~10 months.
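
To make the arithmetic concrete, here is a rough sketch of the break-even calculation. The $8,000 hardware price is an assumption, back-calculated from the $800/mo and ~10 month figures above; the function name and the optional running-cost parameter are hypothetical and only there for illustration.

```python
def break_even_months(hardware_cost_usd: float,
                      monthly_cloud_spend_usd: float,
                      monthly_local_running_cost_usd: float = 0.0) -> float:
    """Months until a one-time hardware purchase pays for itself.

    All inputs are illustrative; the running cost (power, maintenance)
    defaults to zero, which flatters the local option.
    """
    monthly_savings = monthly_cloud_spend_usd - monthly_local_running_cost_usd
    if monthly_savings <= 0:
        # Local inference never pays off if it costs as much as the cloud bill.
        return float("inf")
    return hardware_cost_usd / monthly_savings


# Assumed $8,000 machine (not stated in the comment) against $800/mo of tokens.
print(break_even_months(8_000, 800))  # 10.0
```

Any nonzero running cost pushes the break-even point later, e.g. `break_even_months(8_000, 800, 100)` is a bit over 11 months.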

cat5e • yesterday at 11:15 PM

Huzzah, they’ve lost their stranglehold. Viva la revolution!

sealeck • yesterday at 11:00 PM

Have we reached the limits of scaling? Sadly, it appears that a larger model still equals a better model.

ActorNightly • yesterday at 11:37 PM

Very false.

I use small models exclusively. They aren't a replacement for large models. You need decent hardware to run those models efficiently, as smaller-parameter models plain suck and are still slow on MacBooks. And affordability of higher-end hardware is very limited.

Even at non-VC-subsidized $/token prices, it's still much cheaper to run cloud-based models.
