Hacker News

enum · today at 11:14 AM

In other words, the post is saying: in Lean, Rocq, or any other theorem prover, you get a formally verified proof, but you do NOT get a formally verified theorem statement.

So, even if the proof is correct, you still need to determine whether the theorem is the one you actually want. Making that determination requires expertise. Since you cannot "run the theorem", you cannot vibe-code your way through it; there is no equivalent of "the web app seems to be working!" You have to actually understand, in a deep way, what the theorems are saying.


Replies

SkiFire13 · today at 12:04 PM

Even then, this seems much more promising to me than other areas. Writing theorem statements is much, much easier than coming up with proofs, so it's not a big deal if a human has to do that. And once that's done, getting a correct proof out of an LLM/AI model can be fully automatic (assuming you do get a proof out of it, though!).

koito17 · today at 12:19 PM

In the case of Lean, propositions are encoded as (dependent) types, and the user typically has to write that encoding themselves, then use e.g. tactics to derive a term of said type.
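Concretely, under propositions-as-types, a proof is just a term inhabiting the proposition's type; tactics are one way to construct that term. A minimal sketch using only core Lean lemmas:

```lean
-- A proposition is a type; a proof is a term of that type.
-- Term-mode proof: supply the term directly.
example (a : Nat) : a + 0 = a := Nat.add_zero a

-- Tactic-mode proof: the `by` block constructs the same kind of term.
example (a b : Nat) : a + b = b + a := by
  exact Nat.add_comm a b
```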

Writing a statement you don't fully understand and later getting a proof from an LLM doesn't seem all that useless to me; in my mind, it could still be valuable as exploration. Worst-case scenario: you encoded a tautology and the LLM gave you a trivial proof. Best-case scenario: the proposition ends up being a lemma for something you want to prove.
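The worst case above is easy to illustrate: a tautological statement is one any prover, human or LLM, closes immediately, because the hypothesis already is the conclusion.

```lean
-- A vacuous statement: proving `p` from the assumption `p`.
-- The "proof" is just handing the hypothesis back.
example (p : Prop) (h : p) : p := h
```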

I do think there is a kernel of truth in what you've stated: if the user does not actually understand the statement of a proposition, then the proof is not very useful to them, since they don't know what the statement's truth implies. As someone who used to do mathematics, I still find immense value in vibe-coding away mundane computations, similar to what Lean's `simp` tactic already does, but much broader.
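For reference, `simp` is the sort of automation being alluded to: it rewrites a goal with a database of simplification lemmas until the mundane parts disappear. A small core-Lean example:

```lean
-- `simp` discharges this with core simplification lemmas
-- (e.g. that appending the empty list is a no-op).
example (xs : List Nat) : (xs ++ []).length = xs.length := by
  simp
```

An LLM playing the same role would, in effect, be a vastly more general-purpose version of this kind of goal-closing automation.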
