Hacker News

computerex, today at 6:26 AM

They are all autoregressive. They have just been trained to emit thinking tokens like any other tokens.
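
To make that concrete, here is a toy sketch of the point, not any real model's code. The lookup table `TOY_NEXT`, the `<think>`/`</think>` marker names, and the helper functions are all made up for illustration; a real model would replace the table with a learned next-token distribution. What it shows is that the decoding loop never special-cases the thinking markers: they are sampled and appended like every other token, and the "thinking" is only separated out afterwards.

```python
from typing import List, Tuple

# Hypothetical toy "model": the next token depends only on the last one.
# A real LLM conditions on the whole context and samples from a distribution.
TOY_NEXT = {
    "<bos>": "<think>",
    "<think>": "2+2",
    "2+2": "is",
    "is": "4",
    "4": "</think>",
    "</think>": "Answer:",
    "Answer:": "four",
    "four": "<eos>",
}


def next_token(context: List[str]) -> str:
    """Return the next token given the context (toy stand-in for a forward pass)."""
    return TOY_NEXT[context[-1]]


def generate(prompt: List[str], max_new_tokens: int = 32) -> List[str]:
    """Plain autoregressive loop: predict one token, append it, repeat."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tok = next_token(tokens)
        tokens.append(tok)  # thinking markers are appended like any other token
        if tok == "<eos>":
            break
    return tokens


def split_thinking(tokens: List[str]) -> Tuple[List[str], List[str]]:
    """Post-hoc split: the distinction lives in the markers, not in the loop."""
    start = tokens.index("<think>")
    end = tokens.index("</think>")
    return tokens[start + 1:end], tokens[end + 1:]


out = generate(["<bos>"])
thinking, answer = split_thinking(out)
print(thinking)
print(answer)
```

The only place the markers matter is `split_thinking`, which runs after generation is finished.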