Hacker News

motoboi, today at 3:45 AM

The output is part of the context. The model reasons, but it also outputs tokens. Force it to respond in an unfamiliar format and each next token will veer further from the training distribution, making the model less smart and less useful.
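To make "the output is part of the context" concrete, here is a rough sketch of an autoregressive loop. `generate` and `toy_next_token` are made-up names for illustration, and the toy function is only a stand-in for a real model, not how any particular system is implemented.

```python
from typing import Callable, List


def generate(prompt: List[str],
             next_token: Callable[[List[str]], str],
             max_new: int = 3) -> List[str]:
    """Append tokens one at a time; each call sees prompt + prior output."""
    context = list(prompt)
    for _ in range(max_new):
        token = next_token(context)  # conditioned on everything so far
        context.append(token)        # the output becomes part of the context
    return context[len(prompt):]


def toy_next_token(context: List[str]) -> str:
    # Hypothetical stand-in for a model: the token depends on context length,
    # which is enough to show that earlier outputs influence later ones.
    return f"t{len(context)}"


new_tokens = generate(["a", "b"], toy_next_token)
print(new_tokens)
```

The point is only that every generated token is fed back in before the next one is chosen, so an odd output format keeps influencing what follows.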


Replies

nearbuy, today at 2:38 PM

It won't matter. By the time it's done reasoning, it has already decided what it wants to say.

Reasoning tokens are just regular output tokens the model generates before answering. The UI just doesn't show the reasoning. Conceptually, the output is something like:

  <reasoning>
    Lots of text here
  </reasoning>
  <answer>
    Part you see here. Usually much shorter.
  </answer>
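
As a rough sketch of what "the UI just doesn't show the reasoning" could look like in code, assuming the output really used the illustrative `<reasoning>` and `<answer>` tags from the layout above (real models use their own delimiters, and `split_output` is a made-up helper):

```python
import re
from typing import Tuple

# Hypothetical raw output, using the illustrative tags from the comment above.
RAW_OUTPUT = """<reasoning>
Lots of text here
</reasoning>
<answer>
Part you see here. Usually much shorter.
</answer>"""


def split_output(raw: str) -> Tuple[str, str]:
    """Return (reasoning, answer) extracted from one generated stream."""
    def grab(tag: str) -> str:
        match = re.search(rf"<{tag}>(.*?)</{tag}>", raw, re.DOTALL)
        return match.group(1).strip() if match else ""
    return grab("reasoning"), grab("answer")


reasoning, answer = split_output(RAW_OUTPUT)
# A UI would display only `answer`; `reasoning` was still generated first
# and sat in the context while the answer tokens were produced.
print(answer)
```

Both parts come out of the same token stream; hiding one of them is purely a display decision.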