For those interested, we made some MXFP4 GGUFs at https://huggingface.co/unsloth/Qwen3.5-397B-A17B-GGUF and a guide to run them: https://unsloth.ai/docs/models/qwen3.5
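As a rough sketch of what running one of these GGUFs can look like with llama-cpp-python (the guide above is the authoritative reference; the filename and settings below are placeholders, assuming you've already downloaded a quant locally):

    # Minimal sketch: load a local GGUF with llama-cpp-python and generate.
    from llama_cpp import Llama

    llm = Llama(
        model_path="Qwen3.5-397B-A17B-MXFP4.gguf",  # placeholder filename
        n_ctx=8192,        # context window
        n_gpu_layers=-1,   # offload all layers to GPU; lower if VRAM is tight
    )

    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": "Hello"}],
        max_tokens=128,
    )
    print(out["choices"][0]["message"]["content"])

Dropping n_gpu_layers to a small positive number keeps most of the model in system RAM, which is the usual workaround when the quant doesn't fit in VRAM.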
Are smaller 2/3-bit quantizations worth running vs. a more modest model at 8- or 16-bit? I don't currently have the VRAM to match my interest in this.