logoalt Hacker News

DoctorOetkeryesterday at 1:34 AM0 repliesview on HN

>I just don't see a reason to assume that an agent will be able to succeed at it and at the same time fail at other things that could make the whole exercise redundant.

But that is much simpler to understand: eventually finding a proof using guided search (machines searching for proofs, multiple inference attempts) takes more effort than verifying a proof. Formal verification does not disappear, because communicating a valuable succinct proof is much cheaper than having to search for the proof anew. The proofs will become inevitable lingua franca (like it is among capable humans) for computers as well. Basic economics will result in adoption of formal verification.

Whenever humans found an original proof, their notes will contain a lot of deductions that were ultimately not used, they were searching for a proof, using intuition gained in reading and finding proofs of other theorems. It's just that LLM's are similarily gaining intuition, and at some point become better than humans at finding proofs. It is currently already much better than the average human at finding proofs. The question is how long it takes until it gets better than any human being at finding proofs.

The future you see where the whole proving exercise (if by humans or by LLMs) becomes redundant because it immediately emits the right code is nonsensical: the frontier of what LLM's are capable of will move gradually, so for each generation of LLMs. there will always be problems it can not instantly generate provably correct software (but omitting the according-to-you-unnecessary proof). That doesn't mean they can't find the proofs, just that it would have to search by reasoning, with no guarantee if it ever finds a proof.

That search heuristic is Las Vegas, not Monte Carlo.

Companies will compare the levelized operating costs of different LLM's to decide which LLMs to use in the future on hard proving tasks.

Satellite data centers will consume ever more resources in a combined space/logic race for cryptographic breakthroughs.