I truly think by 2028 we'll have integrated chip systems that'll be able to run Opus 4.8-level models at ~500 watts with acceptable performance. Honestly I think now is the worst time to invest in AI hardware. Get your harness ready and your processes perfected with hosted models, then wait a few years to buy hardware and transition to running models locally.
If such hardware becomes available, it will be bought up by the data centers, just like they buy all the RAM today.
Honestly I think now is the worst time to invest in AI hardware.
That position is not without its own risks, though. Maybe Opus 4.8 will run on a single chip by 2028... and maybe you won't be allowed to touch it.
And what if Xi makes a play for Taiwan? That would be stupid, but so was invading Ukraine with tanks from Temu, and it still happened.
Burning weights onto a chip in an efficient way and exposing it over USB would be acceptable for a good-enough model tbh.