Hacker News

catigula, last Friday at 4:52 PM (1 reply)

Yes, they're purposely not 'trained on' chain-of-thought, to avoid making it useless for interpretability. As a result, some models can find it epistemically shocking if you tell them you can see their chain-of-thought. More recent models are clever enough to infer, without being trained to, that you can see their chain-of-thought.


Replies

DenisM, last Friday at 5:35 PM

It is in their training set by now.