Nobody runs unquantized; there's literally no reason to. Q8 is the largest anyone actually runs on consumer hardware for inference.
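For a rough sense of why, here is a back-of-the-envelope sketch of the memory the weights alone take at different bit widths. The 7B parameter count and the `weight_gib` helper are made up for illustration. It assumes exactly 16, 8, and 4 bits per weight; real quantized formats also store scales, so their effective bits per weight come out a bit higher, and the KV cache and activations are not counted.

```python
def weight_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GiB (hypothetical helper)."""
    return n_params * bits_per_weight / 8 / 2**30

# Assumed example: a 7-billion-parameter model at idealized bit widths.
for label, bits in [("FP16", 16), ("Q8", 8), ("Q4", 4)]:
    print(f"{label}: {weight_gib(7e9, bits):.1f} GiB")
```

Each halving of the bit width halves the weight footprint, which is often the difference between fitting in a consumer GPU's VRAM or not.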
Halving the precision of the weights is not a free lunch...
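To make the "not a free lunch" point concrete, here is a minimal sketch of what dropping bits costs in round-trip error. It assumes plain symmetric round-to-nearest quantization with one scale for the whole tensor and synthetic Gaussian "weights". Real schemes (per-block scales, smarter rounding) do better than this, and weight error is only a proxy for actual model quality.

```python
import random

def quantize_roundtrip(weights, bits):
    """Symmetric round-to-nearest quantization with a single scale, then dequantize."""
    qmax = 2 ** (bits - 1) - 1  # 127 for 8-bit, 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax
    return [max(-qmax, min(qmax, round(w / scale))) * scale for w in weights]

def rms_error(a, b):
    """Root-mean-square difference between two equal-length lists."""
    return (sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)) ** 0.5

random.seed(0)
# Synthetic stand-in for a weight tensor, not real model weights.
weights = [random.gauss(0.0, 0.02) for _ in range(4096)]

for bits in (8, 4):
    err = rms_error(weights, quantize_roundtrip(weights, bits))
    print(f"{bits}-bit RMS round-trip error: {err:.6f}")
```

The 4-bit error comes out well above the 8-bit error, since the step between representable values grows as the number of levels shrinks.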