polotics • today at 6:40 PM • 1 reply • view on HN

Leading to my question: OK, keeping a zero and a minus-zero does make sense for some limit calculations... But when all you have is 4 bits, is this not quite wasteful? Would using those bits for e.g. a 2.5 not improve the model?
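
To put a number on "wasteful", here is a rough sketch that enumerates all 16 codes of a 4-bit float. It assumes an E2M1 layout (1 sign bit, 2 exponent bits, 1 mantissa bit, exponent bias 1); that layout is my assumption for illustration, the comment above only says "4 bits". The helper name `decode_e2m1` is made up for this example.

```python
import math

def decode_e2m1(code: int) -> float:
    """Decode a 4-bit code as sign(1) | exponent(2) | mantissa(1), bias 1.

    Assumed layout for illustration only.
    """
    sign = -1.0 if (code >> 3) & 1 else 1.0
    exp = (code >> 1) & 0b11
    man = code & 1
    if exp == 0:
        # Subnormal range: no implicit leading 1, so the magnitude is 0 or 0.5.
        magnitude = man * 0.5
    else:
        # Normal range: implicit leading 1, mantissa bit adds 0.5.
        magnitude = (1 + man * 0.5) * 2.0 ** (exp - 1)
    return sign * magnitude

values = [decode_e2m1(c) for c in range(16)]

# 0.0 and -0.0 compare equal, so they collapse into one entry in a set.
distinct = sorted(set(values))

# Non-negative magnitudes only (codes 0..7).
positive_magnitudes = [decode_e2m1(c) for c in range(8)]

# Code 0b1000 decodes to zero with the sign bit set, i.e. minus-zero.
minus_zero_is_negative = math.copysign(1.0, decode_e2m1(0b1000)) < 0

print(len(values), len(distinct))
print(positive_magnitudes)
```

Under that assumption, 16 codes give only 15 distinct values, so one code in sixteen is spent on minus-zero, and 2.5 is indeed not among the representable magnitudes (the grid jumps from 2 to 3).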

Replies

polotics • today at 6:52 PM

Oh well, that's a rabbit hole: NVIDIA Blackwell has this, and GGUFs sidestep it with Qi_j / Qi_K... Great article, piques curiosity!
