Usually it’s not a different model, it’s the same model with different inference-time settings. “Thi...

Yahyaaa • last Monday at 1:29 PM • 0 replies • view on HN

Usually it’s not a different model, it’s the same model with different inference-time settings. “Thinking effort” typically changes the compute budget and decoding behavior (how many steps, how much exploration, sometimes internal planning loops).

Some stacks also tie it to orchestration layers or system/prompt signals, which is why it can look inconsistent across products

alt Hacker News