Hacker News

tcdent 08/08/2025

I incorporated the quantization aspect because it's not that simple.

Yes, old hardware will be slower, but you will also need significantly more of it just to run the model at all.

RAM is the expensive part, and you need lots of it. You need even more on older hardware, which has less efficient float implementations (no native low-precision formats such as FP8), so the weights have to be kept in wider formats.

https://developer.nvidia.com/blog/floating-point-8-an-introd...
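
To put rough numbers on the memory point, here is a back-of-the-envelope sketch. The 70B parameter count is a hypothetical example, not a figure from this thread, and the estimate covers weights only: it ignores the KV cache, activations, and the per-block scale factors that quantized formats carry.

```python
# Bits per weight for each storage format. The format widths are standard;
# the model size used below is a hypothetical example.
BITS_PER_WEIGHT = {"FP16": 16, "FP8": 8, "FP4": 4}


def weight_memory_gb(num_params: int, fmt: str) -> float:
    """Memory needed for the weights alone, in GB (10^9 bytes)."""
    return num_params * BITS_PER_WEIGHT[fmt] / 8 / 1e9


if __name__ == "__main__":
    params = 70_000_000_000  # hypothetical 70B-parameter model
    for fmt in BITS_PER_WEIGHT:
        print(f"{fmt}: {weight_memory_gb(params, fmt):.0f} GB")
```

Under these assumptions the same weights take about 140 GB at FP16, 70 GB at FP8, and 35 GB at FP4, so hardware that cannot use the narrower formats efficiently needs roughly two to four times the memory for the weights.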


Replies

fredmcawesome 08/09/2025

But surely this is short term? Once you get older hardware with FP4 support this shouldn't be a concern.