logoalt Hacker News

justboy1987today at 2:51 AM1 replyview on HN

You're touching on the fundamental "who watches the watchmen" problem in formal verification. But I think the framing slightly misses the key asymmetry: reviewing a Lean4 spec is dramatically easier than reviewing the implementation it constrains.

A formal spec in Lean is typically 10-50x shorter than the code it proves correct. More importantly, Lean's type checker is itself a small, trusted kernel (~10k lines) that has been scrutinized by the PL community for years. So you're not trusting the agent — you're trusting the kernel.

The practical workflow isn't "agent writes spec + code." It's: human writes spec (the hard creative part), agent generates proof that code satisfies spec, Lean kernel mechanically checks the proof. The agent can hallucinate all it wants in step 2 — if the proof doesn't typecheck, it gets rejected deterministically.

The real bottleneck is step 1: writing good specs requires domain expertise. But that's exactly where humans should stay in the loop. It's a much better division of labor than reviewing thousands of lines of generated code.


Replies

wazHFsRytoday at 7:32 AM

Does that mean your production code is lean? Or do you translate some other language code to lean to verify it?

show 1 reply