Hacker News

oceansky, today at 3:13 PM

Yes. It's post-training in Qwen using the novel SwiReasoning framework.


Replies

hedgehog, today at 3:49 PM

I hadn't seen SwiReasoning (https://swireasoning.github.io, paper and code). It looks like it works at generation time without any requirements on the model. It increases token efficiency and accuracy, but at first skim it seems like it would be incompatible with multi-token prediction. For large reductions in token budget it could be worth it.
