Hacker News

cubefox, yesterday at 3:45 PM

I assume that, in theory, 1-bit models could be the most efficient, since modern models have already moved from 32 bits to 16 bits to 8 bits per parameter (natively, without quantization).
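
To put rough numbers on that trend, here is a minimal sketch of how weight storage scales with bits per parameter. The 7-billion-parameter count is a hypothetical figure picked for illustration, and the helper name `weight_memory_gib` is made up here. It only counts memory for the weights, so it says nothing about compute cost or model quality at each precision.

```python
def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Memory needed to store the weights alone, in GiB."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


# Hypothetical model size, chosen only for illustration.
NUM_PARAMS = 7_000_000_000

# Weight footprint at each precision mentioned in the comment, plus 1 bit.
footprints = {bits: weight_memory_gib(NUM_PARAMS, bits) for bits in (32, 16, 8, 1)}

for bits, gib in footprints.items():
    print(f"{bits:>2}-bit: {gib:6.2f} GiB")
```

Storage scales linearly with bit width, so under these assumptions a 1-bit model would need 1/32 of the weight memory of the 32-bit version and 1/8 of the 8-bit one.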