The reasoning is the secret sauce. They don't output that. But to let you have some feedback about what is going on, they pass this reasoning through another model that generates a human friendly summary (that actively destroys the signal, which could be copied by competition).
Don't or can't.
My assumption is the model no longer actually thinks in tokens, but in internal tensors. This is advantageous because it doesn't have to collapse the decision and can simultaneously propogate many concepts per context position.