So, to check my understanding: quantization in a byteshape model is per-tensor, the bit-width can vary between tensors, and the reported bit-width is an "average" over the final result? The results look good; I'm curious why this approach isn't more prevalent! I'd also love to better understand what factors into "accuracy", since there may be some nuance depending on the measure.
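Here's a rough sketch of what I think is going on, just to make sure I'm reading it right. To be clear, this is toy code based on my own assumptions, not their actual method: the tensor names, the per-tensor bit allocation, and the simple symmetric scheme are all hypothetical.

```python
# Hypothetical sketch: per-tensor, variable-bit quantization where the
# reported bits-per-weight is a size-weighted average across tensors.
import numpy as np

def quantize_tensor(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform quantization of a single tensor at the given bit-width."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax if w.size else 1.0
    q = np.clip(np.round(w / scale), -qmax - 1, qmax)
    return q * scale  # dequantized values, for inspecting the error

def average_bits(assignment: dict[str, tuple[np.ndarray, int]]) -> float:
    """Size-weighted average bit-width across all tensors."""
    total_bits = sum(w.size * b for w, b in assignment.values())
    total_params = sum(w.size for w, _ in assignment.values())
    return total_bits / total_params

# Toy "model": more sensitive tensors get more bits, others get fewer.
rng = np.random.default_rng(0)
tensors = {
    "attn.q_proj": (rng.normal(size=(256, 256)), 5),
    "attn.k_proj": (rng.normal(size=(256, 256)), 3),
    "mlp.up_proj": (rng.normal(size=(256, 1024)), 4),
}
dequantized = {name: quantize_tensor(w, b) for name, (w, b) in tensors.items()}
print(f"effective bits/weight ~ {average_bits(tensors):.2f}")
```

So the headline "N-bit" number would be that size-weighted average rather than a uniform precision, if I've understood correctly.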
> Would also love to better understand what factors into "accuracy" since there might be some nuance there depending on the measure.
It's accuracy across GSM8K, MMLU, IFEval, and LiveCodeBench.
They detail their methodology here: https://byteshape.com/blogs/Qwen3-4B-I-2507/