TIL: > Quantization-Aware Training (QAT) [...] allows preserving similar quality to bfloat16 wh...

me_bx • today at 8:12 PM • 0 replies • view on HN

TIL:

> Quantization-Aware Training (QAT) [...] allows preserving similar quality to bfloat16 while dramatically reducing the memory requirements to load the model
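
For a rough sense of the memory side of that claim, here is a back-of-the-envelope sketch. The quote gives no bit width or model size, so the 4-bit target and the 27-billion-parameter count below are assumptions picked only for illustration. The estimate covers weights alone and ignores activations, KV cache, and the small overhead of quantization scales.

```python
def weight_memory_gib(num_params: int, bits_per_weight: float) -> float:
    """Approximate memory needed to hold the weights alone, in GiB."""
    return num_params * bits_per_weight / 8 / 2**30


# Hypothetical parameter count, not taken from the quoted source.
params = 27_000_000_000

bf16 = weight_memory_gib(params, 16)  # bfloat16 stores 16 bits per weight
int4 = weight_memory_gib(params, 4)   # assumed 4-bit quantized weights

print(f"bfloat16: {bf16:.1f} GiB")
print(f"4-bit:    {int4:.1f} GiB")
print(f"ratio:    {bf16 / int4:.0f}x")
```

Under these assumptions the weights shrink by the ratio of the bit widths (16 / 4). As the quote describes it, QAT's role is to keep quality close to bfloat16 at the lower precision.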
