For consumers, there's little reason to run unquanted, especially for large models which take less of a hit from quantization. I'm running a 200b model at Q3 with very little degradation. A 1000b model would see even less change.