> It's how LLMs can do things like translate from language to language
The heavy lifting here is done by embeddings: semantically similar text, even across languages, maps to nearby points in the same vector space. That doesn't require a world model or "thought".
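The idea can be sketched with a toy nearest-neighbor lookup. The vectors below are hand-made stand-ins, not real learned embeddings; actual multilingual models place translation pairs near each other in spaces with hundreds of dimensions, but the lookup works the same way:

```python
import math

def cosine(u, v):
    # similarity between two vectors: dot product over the product of norms
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

# Toy, hand-made vectors standing in for a learned multilingual embedding space.
embeddings = {
    "dog":   [0.9, 0.1, 0.0],
    "cat":   [0.8, 0.3, 0.1],
    "house": [0.1, 0.9, 0.2],
}

# Pretend embedding of the French word "chien" (hypothetical values).
query = [0.88, 0.12, 0.02]

# Nearest neighbor in the shared space acts as a crude translation.
best = max(embeddings, key=lambda w: cosine(query, embeddings[w]))
print(best)  # → dog
```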
LLMs are compression and prediction. The most efficient way to (lossily) compress most things is to actually understand them. Not saying LLMs do a good job of that, but that is the fundamental mechanism here.
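The compression-prediction link is concrete: an ideal entropy coder (e.g. arithmetic coding) spends about -log2 p bits per symbol, so a model that predicts the data better compresses it smaller. A minimal sketch using a unigram character model rather than an LLM:

```python
import math
from collections import Counter

def bits_needed(text, probs):
    # ideal code length under a predictive model: sum of -log2 p(symbol)
    return sum(-math.log2(probs[c]) for c in text)

text = "abracadabra abracadabra abracadabra"

# Model 1: uniform over the characters that occur (predicts nothing).
alphabet = set(text)
uniform = {c: 1 / len(alphabet) for c in alphabet}

# Model 2: character frequencies learned from the text (a crude "understanding").
counts = Counter(text)
unigram = {c: counts[c] / len(text) for c in text}

# The better predictor needs fewer bits to encode the same text.
print(bits_needed(text, uniform))
print(bits_needed(text, unigram))
```

A stronger model (bigrams, or a neural language model) would shrink the bit count further; that is the sense in which better prediction is better compression.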