figarus314 • today at 4:29 PM • 4 replies

A model solving original math problems may look like it is reasoning the way a human does, but internally the model is choosing the next token based on learned probabilities over various patterns and structures. The model has learned correlations between problems, proof techniques, and answer structures, and when it "reasons" it is selecting a high-probability trajectory through that learned knowledge.

A calculator is different because it is not probabilistic; it executes a fixed procedure. One of these models, when doing math, is more like a learned probabilistic system that understands enough structure around mathematics that some of its high probability trajectories seem like genuine reasoning.
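To make the contrast concrete, here is a rough sketch of the two mechanisms. Everything in it is hypothetical: the token table `NEXT_TOKEN_PROBS` and its probabilities are made up for illustration and do not come from any real model, and a real model computes its distribution from the whole context rather than looking it up.

```python
import random

# Hypothetical toy distribution over the next token after the prompt "2 + 2 =".
# The probabilities are invented for illustration, not taken from any real model.
NEXT_TOKEN_PROBS = {"4": 0.90, "5": 0.06, "22": 0.04}


def calculator_add(a: int, b: int) -> int:
    """Fixed procedure: the same inputs always give the same output."""
    return a + b


def sample_next_token(probs: dict[str, float], rng: random.Random) -> str:
    """Probabilistic selection: draw one token in proportion to its probability."""
    tokens = list(probs)
    weights = [probs[t] for t in tokens]
    return rng.choices(tokens, weights=weights, k=1)[0]


def greedy_next_token(probs: dict[str, float]) -> str:
    """Pick the single most probable token (deterministic given the distribution)."""
    return max(probs, key=probs.get)


if __name__ == "__main__":
    rng = random.Random(0)  # seeded so repeated runs give the same draws
    print("calculator:", calculator_add(2, 2))
    samples = [sample_next_token(NEXT_TOKEN_PROBS, rng) for _ in range(1000)]
    print("sampled counts:", {t: samples.count(t) for t in NEXT_TOKEN_PROBS})
    print("greedy pick:", greedy_next_token(NEXT_TOKEN_PROBS))
```

The calculator can only ever return one answer for a given input, while the sampler usually lands on the most likely token but can occasionally pick a less likely one. Greedy selection removes the randomness, but the choice still comes from a learned distribution rather than from an arithmetic rule.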

The difference is that when a human reasoner goes to solve a problem, they'll think "this kind of proof usually goes this way" - following an explicit rule enforcement. The model may produce the same output, and may even appear to approach it the same way, but the mechanism is probabilistic pattern selection rather than explicit rule enforcement.

Replies

visarga • today at 5:56 PM

You talk as if problem solving is a supervised (imitation) learning problem. No, it is a reinforcement learning problem: models learn by solving problems and getting rated. They generate their own training data. Optimal budget allocation is 1/3 of the cost on pre-training, 1/3 on RL, and 1/3 on inference.

XMPPwocky • today at 5:22 PM

> The difference is that when a human reasoner goes to solve a problem, they'll think "this kind of proof usually goes this way" - following an explicit rule enforcement.

How is this different from "probabilistic pattern selection"?

senordevnyc • today at 5:44 PM

I don’t think there’s any evidence that “human reasoning” isn’t also based on probabilistic pattern selection.

ieieue • today at 5:17 PM

It’s amazing simple things have to be reiterated.

Perhaps it’s best if most admit they don’t have the fundamental ways of thinking to even participate in the conversation.

When all nuance is lost, the discussion must end.
