Hacker News

fnord77 yesterday at 10:12 PM

> you now are having a business is planned by a paid chatbot, they can shutdown anytime or make it more expensive

Local models are about 25 months behind the current SOTA. If that holds, businesses won't need the paid models for many things.


Replies

hadlock yesterday at 11:29 PM

I suspect that by 2030 you (a small-to-medium business) will be able to buy a Claude 4.6-class rack-mount device for $6,000 that does 100 t/s with a 1-million-token context. Honestly, that is probably adequate for an office (front office, back office, executive tier, etc.) of 10-300 people, unless you've got more than 4 engineers on staff. That kind of offline device is going to push everyone to provide that kind of baseline service in the cloud at very low cost. The Qwen 3.5 series is already showing you can almost (but not quite) squeeze that kind of performance out of consumer hardware. Consumer video cards with 256/512 GB of memory will get us there eventually, if capacity ever catches up with demand.