Huh, I wonder if that's why you cannot change the temperature when thinking is enabled. Do you have a link for the paper?
https://transformer-circuits.pub/2026/emotions/index.html
At the inference level, temperature can be applied at any step, since generation is token by token, but that doesn't mean the API necessarily exposes it.
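To make the token-by-token point concrete, here's a minimal sketch of a sampling loop where the temperature is chosen fresh at every step. Everything in it is made up for illustration: `toy_logits` is a hypothetical stand-in for a model forward pass, and `temperature_schedule` is just a callback I invented. It says nothing about how any particular provider's API or serving stack actually works.

```python
import math
import random


def sample_with_temperature(logits, temperature, rng):
    """Pick one token index from raw logits at the given temperature."""
    if temperature <= 0:
        # Assumption: treat temperature 0 as greedy decoding (argmax).
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


def toy_logits(prefix):
    """Hypothetical stand-in for a model: logits over a 4-token vocabulary."""
    base = [2.0, 1.0, 0.5, 0.0]
    shift = len(prefix) % 4
    return base[shift:] + base[:shift]


def generate(steps, temperature_schedule, seed=0):
    """Generate `steps` tokens, asking the schedule for a temperature each step."""
    rng = random.Random(seed)
    tokens = []
    for step in range(steps):
        logits = toy_logits(tokens)
        temperature = temperature_schedule(step)  # can differ per token
        tokens.append(sample_with_temperature(logits, temperature, rng))
    return tokens


# Greedy for the first two tokens, then sample at temperature 1.0.
mixed = generate(6, lambda step: 0.0 if step < 2 else 1.0)
```

Nothing in the loop forces a single temperature for the whole generation, so whether you can set it is a decision made at the API layer, not a constraint of the sampling step itself.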