I suspect that by 2030 you (a small-to-medium business) will be able to buy a Claude 4.6-class rack-mount device for $6,000 that does 100 t/s with a 1-million-token context, which, honestly, is probably adequate for an office (front office, back office, executive tier, etc.) of 10-300 people unless you've got more than 4 engineers on staff. That kind of offline device is going to push every provider to offer that kind of baseline as a cloud service at very low cost. The Qwen 3.5 series is already showing that you can almost (but not quite) squeeze that kind of performance out of consumer hardware. Consumer video cards with 256/512 GB of memory will get us there eventually, if capacity ever catches up with demand.
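
For a rough sanity check on the 10-300 range, here's a back-of-envelope sketch. The assumptions are mine, not measurements: the 100 t/s is treated as the box's total generation throughput, it stays fully busy for an 8-hour workday, and the output is split evenly across users. The function names are made up for illustration.

```python
def daily_token_budget(tokens_per_second: float, hours_per_day: float = 8.0) -> int:
    """Total tokens generated in one workday at a constant rate (assumed fully utilized)."""
    return int(tokens_per_second * 3600 * hours_per_day)


def per_user_budget(tokens_per_second: float, users: int, hours_per_day: float = 8.0) -> int:
    """Even split of the daily budget across users (an assumption; real usage is bursty)."""
    return daily_token_budget(tokens_per_second, hours_per_day) // users


if __name__ == "__main__":
    rate = 100  # t/s, the figure from the comment above
    print(daily_token_budget(rate))    # 2880000 tokens per 8-hour day
    print(per_user_budget(rate, 10))   # 288000 tokens per person per day
    print(per_user_budget(rate, 300))  # 9600 tokens per person per day
```

So under those assumptions a 300-person office gets roughly 9,600 generated tokens per person per day, which is fine for light use but would get tight quickly for anyone running long agentic coding sessions, hence the caveat about engineers.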