How is the research on training these models directly in their quantized state going?
That'll be the real game changer.
The original BitNet was natively trained at 1.58 bits per weight (ternary weights, log2(3) ≈ 1.58). PrismML hasn't released any real details about how they trained, but since their models are based on Qwen, there was almost certainly some post-training quantization involved rather than native low-bit training.
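For anyone curious what "natively trained at 1.58 bits" means in practice: the trick in the BitNet b1.58 paper is to keep full-precision latent weights for the optimizer, quantize them to {-1, 0, 1} in the forward pass, and use a straight-through estimator so gradients flow as if the rounding weren't there. Here's a minimal PyTorch sketch of that idea (simplified and illustrative only; the paper's actual BitLinear also quantizes activations to 8 bits and applies normalization):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def ternary_quantize(w: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Absmean ternary quantization (BitNet b1.58 style): scale by the
    mean absolute weight, then round-clip to {-1, 0, 1} and rescale."""
    scale = w.abs().mean().clamp(min=eps)
    w_q = (w / scale).round().clamp(-1, 1)
    return w_q * scale  # rescale so output magnitudes stay reasonable

class TernaryLinear(nn.Module):
    """Linear layer whose forward pass uses ternary weights while the
    optimizer updates full-precision latent weights (hypothetical name,
    simplified relative to the paper's BitLinear)."""
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        # Straight-through estimator: forward sees quantized weights,
        # backward treats the quantizer as the identity function.
        w_q = w + (ternary_quantize(w) - w).detach()
        return F.linear(x, w_q)
```

You train it like any other layer; only the forward computation is ternary. That's the key difference from post-training quantization, where the model never sees the quantization error during training and just has to absorb it afterward.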