I incorporated the quantization aspect because it's not that simple.
Yes, old hardware will be slower, but you will also need significantly more of it just to run the model at all.
RAM is the expensive part, and you need lots of it. You need even more of it on older hardware, which has less efficient float implementations (no native FP8, for example).
https://developer.nvidia.com/blog/floating-point-8-an-introd...
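To put rough numbers on the RAM point, here is a back-of-the-envelope sketch in Python. The 70B parameter count is a made-up example and `weight_memory_gb` is just a name I picked for illustration. It counts weights only and ignores KV cache, activations, and the per-block scale factors that real quantization schemes add, so treat the results as a lower bound.

```python
# Bits used to store one weight in each format.
BITS_PER_WEIGHT = {"FP16": 16, "FP8": 8, "FP4": 4}


def weight_memory_gb(num_params: int, fmt: str) -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BITS_PER_WEIGHT[fmt] / 8 / 1e9


if __name__ == "__main__":
    params = 70_000_000_000  # hypothetical model size, not a specific model
    for fmt in BITS_PER_WEIGHT:
        print(f"{fmt}: {weight_memory_gb(params, fmt):.0f} GB")
```

Each halving of the bits per weight halves the memory needed for the weights, so a box that has to hold weights at FP16 needs roughly four times the RAM of one that can hold them at FP4.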
But surely this is short term? Once hardware with FP4 support is itself old enough to be cheap, this shouldn't be a concern.