>within a few years
Eventually? We'll see. Frontier models still need some pretty serious hardware, and that hardware will only slowly come down in cost. Smaller models are becoming more capable, and that trend will presumably continue.
I think there's still a pretty big gap, though. Claude estimates that Opus 4.6 and GLM-5 each need about 1.5 TB of VRAM, and puts gpt-5.5 at around 3-6 TB.
That's 8x Nvidia H200 @ ~$30k USD each, so roughly $240k all in. That still calls for big efficiency improvements and a big drop in hardware cost.
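For anyone curious where estimates like these come from, the usual back-of-envelope is weights-only VRAM = parameter count × bytes per parameter (ignoring KV cache and activations). A quick sketch; the parameter count below is a placeholder, since none of these models' sizes are public:

```python
# Weights-only VRAM estimate. Ignores KV cache, activations, and framework
# overhead, so real requirements run higher.
def weight_vram_tb(params_billions: float, bytes_per_param: float = 2.0) -> float:
    # bf16/fp16 weights are 2 bytes each; 4-bit quantization is ~0.5.
    return params_billions * 1e9 * bytes_per_param / 1e12

# Hypothetical 750B-parameter model (placeholder, not a known figure):
print(weight_vram_tb(750))        # 1.5 TB at bf16
print(weight_vram_tb(750, 0.5))   # 0.375 TB at 4-bit quantization
print(8 * 30_000)                 # 240000 -- USD for 8x H200 at ~$30k
```

The same formula also shows why quantization matters so much here: dropping from bf16 to 4-bit cuts the VRAM bill by 4x before you've touched the hardware.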
Or a single MLX cluster, if you can find second-hand machines somewhere. Hard to get your hands on today, certainly, but not impossible.