Hacker News

me_bx, today at 8:12 PM

TIL:

> Quantization-Aware Training (QAT) [...] allows preserving similar quality to bfloat16 while dramatically reducing the memory requirements to load the model
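
For a rough sense of the memory side of that claim, here is a back-of-the-envelope sketch. It is not from the quoted source: the helper `weight_memory_gib` and the 1-billion-parameter count are hypothetical, picked only for illustration, and it counts weights only (no activations, KV cache, or quantization scales).

```python
def weight_memory_gib(num_params: int, bits_per_weight: int) -> float:
    """Approximate memory needed for the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# Hypothetical model size, chosen only for illustration.
NUM_PARAMS = 1_000_000_000

if __name__ == "__main__":
    for name, bits in [("bfloat16", 16), ("int8", 8), ("int4", 4)]:
        print(f"{name}: {weight_memory_gib(NUM_PARAMS, bits):.2f} GiB")
```

Under these assumptions, 4-bit weights take a quarter of the bfloat16 footprint. The point of QAT, per the quote, is that quality stays close to bfloat16 at that smaller size.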