Yes, they can.
Some people like to parrot "next token prediction", "LLMs can only interpolate", and similar slogans, but these claims are demonstrably false for many reasons, particularly since the introduction of RL.
Humans do not have a monopoly on generating novel ideas; modern AI models using post-training, RL, etc. can arrive at them the same way we do: through exploration.
See also verifier's law [0]: "The ease of training AI to solve a task is proportional to how verifiable the task is. All tasks that are possible to solve and easy to verify will be solved by AI."
This has already played out in chess, Go, and strategy games, and we can now see it applying to mathematics, algorithmic problems, etc.
It is incredibly humbling to see AI outperform humans at creative cognitive tasks, and to realise that the bitter lesson [1] applies so generally, but here we are.
[0] https://www.jasonwei.net/blog/asymmetry-of-verification-and-...
[1] http://www.incompleteideas.net/IncIdeas/BitterLesson.html
Reinforcement learning for "reasoning" perturbs the model to generate completions with a particular chain-of-thought / alternative-selection structure. It's three next-token predictors in a trench coat.
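The mechanism both sides are arguing about can be made concrete with a toy sketch: RL on a verifiable reward, here plain REINFORCE with a moving-average baseline. Everything in it (the digit "vocabulary", the sum-to-10 "verifier", the hyperparameters) is made up for illustration; this is a minimal sketch of the idea, not how any production model is actually trained. It does show the core point from the verifier's-law comment: when a task is cheap to verify, reward alone is enough to shift the sampling distribution toward solutions, starting from pure exploration.

```python
import math
import random

random.seed(0)

VOCAB = list(range(10))   # toy "tokens": digits 0-9 (hypothetical, for illustration)
SEQ_LEN = 2
TARGET = 10               # verifier: do the two sampled tokens sum to 10?

# One logit vector per position; uniform at the start, so the policy
# begins by exploring at random (~9% of pairs sum to 10).
logits = [[0.0] * len(VOCAB) for _ in range(SEQ_LEN)]

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def sample_seq():
    """Sample one token per position from the current policy."""
    seq = []
    for pos in range(SEQ_LEN):
        probs = softmax(logits[pos])
        r, acc = random.random(), 0.0
        for tok, p in zip(VOCAB, probs):
            acc += p
            if r <= acc:
                seq.append(tok)
                break
        else:
            seq.append(VOCAB[-1])  # guard against float rounding
    return seq

def verify(seq):
    """Cheap, exact verifier: reward 1 iff the sequence passes the check."""
    return 1.0 if sum(seq) == TARGET else 0.0

def train(steps=3000, lr=0.5):
    baseline = 0.0
    for _ in range(steps):
        seq = sample_seq()
        advantage = verify(seq) - baseline
        baseline = 0.99 * baseline + 0.01 * verify(seq)
        # REINFORCE: push up the log-prob of sampled tokens when the
        # verifier paid out, push it down otherwise.
        for pos, tok in enumerate(seq):
            probs = softmax(logits[pos])
            for i in range(len(VOCAB)):
                grad = (1.0 if i == tok else 0.0) - probs[i]
                logits[pos][i] += lr * advantage * grad

train()
hit_rate = sum(verify(sample_seq()) for _ in range(1000)) / 1000
print(round(hit_rate, 2))
```

After training, the hit rate is far above the ~9% random baseline: the policy has concentrated on verifier-approved outputs it was never shown, only rewarded for. Whether that counts as "novel ideas" or just a reweighted next-token predictor is exactly the disagreement in this thread.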