> LLMs alone aren't the way to AGI
Pretraining + RL works; there is no clear evidence that it doesn't scale further.
Pretraining + RL itself is the scaling limit. If you feed it the entire dataset before 1905, LLMs aren't going to come up with general relativity. It has no concept of physics, or even of time.
AGI happens when you DON'T need to scale pretraining + RL.