logoalt Hacker News

wrsyesterday at 10:57 PM1 replyview on HN

This is how early forms of "reasoning" in LLMs worked: just literally inserting words like "Wait...", "Hmm...", "Let me reconsider...", "But is it really..." into the token stream.


Replies

flexagoonyesterday at 11:23 PM

Is this not how current forms of reasoning work? It seems like the open models still output things like that, and the closed ones all just summarize their thinking instead to avoid distillation, but probably do the same thing internally.

show 1 reply