samus, yesterday at 11:01 AM

There have been papers on introducing thinking tokens at intermediate layers; those tokens get stripped from the output.