The part that eludes me is how you get from this to the capability to debug arbitrary coding problem...

windowshopping • yesterday at 9:10 PM • 4 replies

The part that eludes me is how you get from this to the capability to debug arbitrary coding problems. How does statistical inference become reasoning?

For a long time, the answer seemed to be that it doesn't. But now, using Claude Code daily, it seems it does.

Replies

mike_hearn • today at 10:25 AM

DNNs aren't really "statistical" inference in the way most people would understand the term statistics. The underlying maths owes much more to calculus than statistics. The model isn't just encoding statistics about the text it was trained on; it's attempting to optimize a solution to the problem of picking the next token, with all the complexity that goes into that.
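
To make the "calculus, not counting" point concrete, here is a toy sketch of next-token training as gradient descent on a cross-entropy loss. It is nowhere near a real DNN: the "model" is just one row of scores per previous token, and the names (`train_next_token`, `pairs`, the 3-token vocabulary, the learning rate) are all made up for illustration.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def train_next_token(pairs, vocab_size, steps=200, lr=0.5):
    """Fit logits[prev][next] by gradient descent on mean cross-entropy.

    Hypothetical toy model: one row of scores per previous token.
    """
    logits = [[0.0] * vocab_size for _ in range(vocab_size)]
    losses = []
    n = len(pairs)
    for _ in range(steps):
        grad = [[0.0] * vocab_size for _ in range(vocab_size)]
        loss = 0.0
        for prev, nxt in pairs:
            probs = softmax(logits[prev])
            loss -= math.log(probs[nxt])
            # Derivative of cross-entropy w.r.t. logit k: p_k - 1[k == nxt]
            for k in range(vocab_size):
                grad[prev][k] += probs[k] - (1.0 if k == nxt else 0.0)
        # Step downhill along the averaged gradient.
        for i in range(vocab_size):
            for k in range(vocab_size):
                logits[i][k] -= lr * grad[i][k] / n
        losses.append(loss / n)
    return logits, losses

# Toy "corpus": the token sequence 0, 1, 2 repeated.
tokens = [0, 1, 2, 0, 1, 2, 0, 1, 2]
pairs = list(zip(tokens, tokens[1:]))
logits, losses = train_next_token(pairs, vocab_size=3)
```

The loss falls from its starting value as the scores are adjusted, and the model ends up preferring token 1 after token 0. In this tiny case the result coincides with what counting would give; the point is only that it was reached by following derivatives of a loss, which is the part that scales to deep networks.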

ferris-booler • yesterday at 10:29 PM

IMO your question is the largest unknown in the ML research field (neural net interpretability is a related area), but the most basic explanation is "if we can always accurately guess the next 'correct' word, then we will always answer questions correctly".

An enormous amount of research+eng work (most of the work of frontier labs) is being poured into making that 'correct' modifier happen, rather than just predicting the next token from 'the internet' (naive original training corpus). This work takes the form of improved training data (e.g. expert annotations), human-feedback finetuning (e.g. RLHF), and most recently reinforcement learning (e.g. RLVR, meaning RL with verifiable rewards), where the model is trained to find the correct answer to a problem without 'token-level guidance'. RL for LLMs is a very hot research area and very tricky to solve correctly.
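
A rough sketch of the "verifiable reward, no token-level guidance" idea, under heavy assumptions: the "policy" is a softmax over four candidate answers to 2 + 2, the reward comes from a checker (`verify`, a hypothetical stand-in for something like a test suite), and the update is the exact policy gradient of expected reward rather than a sampled estimate. Real RLVR setups for LLMs are far more involved.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def verify(answer, a, b):
    """Verifiable reward: 1.0 if the answer equals a + b, else 0.0."""
    return 1.0 if answer == a + b else 0.0

def policy_gradient_step(logits, rewards, lr=1.0):
    """One ascent step on expected reward.

    For a softmax policy, d E[R] / d logit_k = p_k * (R_k - E[R]).
    """
    probs = softmax(logits)
    expected = sum(p * r for p, r in zip(probs, rewards))
    return [l + lr * p * (r - expected)
            for l, p, r in zip(logits, probs, rewards)]

# Hypothetical candidate answers to "2 + 2"; only the final answer is scored.
candidates = [3, 4, 5, 6]
rewards = [verify(c, 2, 2) for c in candidates]

logits = [0.0] * len(candidates)  # start with a uniform policy
for _ in range(100):
    logits = policy_gradient_step(logits, rewards)
probs = softmax(logits)
```

Nothing here tells the policy which answer to pick; it only learns that one outcome passes the check, and probability mass shifts toward it.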

fc417fc802 • yesterday at 10:26 PM

Because it's not statistical inference on words or characters, but rather stacked layers of statistical inference on ~arbitrarily complex semantic concepts, which is then performed recursively.


antonvs • today at 1:30 AM

One problem is that "statistical inference" is overly reductive. Sure, there's a statistical aspect to the computations in a neural network, but there's more to it than that. As there is in the human brain.
