> OpenAI has figured out RL. the models no longer speak english
What does this mean?
Interesting. This happens in Colossus: The Forbin Project (1970), where the rogue AI escapes the semantic drudgery of English and invents its own compressed language with which to talk to its Russian counterpart.
I think it's primarily a reference to this tweet: https://x.com/karpathy/status/1835561952258723930.
The model learns to reason on its own. If you only reward correct results but not readable reasoning, it will find its own way to reason, one that is not necessarily readable by a human. The chain may look like English, but the meaning of those words might be completely different (or even the opposite) for the model. Or it might look like a mix of languages, or like gibberish - to you, but not to the model. Many models write one thing in the reasoning chain and something completely different in the reply.
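
To make the incentive concrete, here is a minimal sketch of what "only reward correct results" means (hypothetical function names, plain Python, not any lab's actual training code): the reasoning string never enters the objective, so nothing pushes it toward readable English.

```python
# Sketch of an outcome-only reward: the chain of thought is ignored entirely,
# only the final answer is scored against a reference.

def outcome_only_reward(chain_of_thought: str, final_answer: str, reference: str) -> float:
    """Reward depends solely on whether the answer matches the reference."""
    del chain_of_thought  # never inspected - readability is not part of the objective
    return 1.0 if final_answer.strip() == reference.strip() else 0.0

# Under this objective, any chain of thought - English, a mix of languages,
# or private shorthand - earns the same reward as long as the answer is right,
# so the policy is free to drift toward whatever encoding works best for it.
```
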
That's the nature of reinforcement learning and of any evolutionary process. It's also why the chain of thought in reasoning models is much less useful for debugging than it seems, even when the chain was guided by a reward model or fine-tuning.