hedgehog • today at 3:49 PM • 1 reply

I hadn't seen SwiReasoning (https://swireasoning.github.io, paper and code) before. It looks like it works at generation time without any requirements on the model. It increases token efficiency and accuracy, but at first skim it seems like it would be incompatible with multi-token prediction. For large reductions in token budget, giving up multi-token prediction could be worth it.

Replies

rafaquintanilha • today at 4:31 PM

Doesn't look like it's incompatible. Someone already released a quantization using MTP: https://huggingface.co/foxipanda/Rio-3.5-Open-397B-GGUF
