Don't you need two of the 512GB ones to run the latest Chinese models unquantized?
Yes, and the result of this $10k endeavour is a much slower and dumber model than any SoTA $20/mo API. That's on top of the maintenance burden of keeping the software and models updated.
Getting 512 GB of RAM at that price point is cheaper than everything else. That's why Apple stopped production to divert for the M5 Ultra.
For consumers, there's little reason to run unquantized, especially for large models, which take less of a hit from quantization. I'm running a 200B model at Q3 with very little degradation. A 1000B model would see even less change.
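
For a rough sense of the memory involved, here's a back-of-the-envelope sketch. It counts weights only (no KV cache or runtime overhead), and the 3.5 bits per weight for Q3 is an assumed average, since the real figure depends on the quant format. The function names are made up for illustration.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal GB: parameters * bits / 8."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float, budget_gb: float) -> bool:
    """True if the weights alone fit in the budget (ignores KV cache and overhead)."""
    return weight_memory_gb(params_billion, bits_per_weight) <= budget_gb


# Assumed average bits per weight; actual quant formats vary.
FP16_BITS = 16
Q3_BITS = 3.5

for size in (200, 1000):
    for label, bits in (("FP16", FP16_BITS), ("Q3", Q3_BITS)):
        gb = weight_memory_gb(size, bits)
        print(f"{size}B @ {label}: ~{gb:.1f} GB, fits in 512 GB: {fits(size, bits, 512)}")
```

By this estimate a 200B model drops from about 400 GB at FP16 to under 90 GB at Q3, and a 1000B model only fits in a single 512 GB machine once quantized.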