I have been following recent progress in the formalization of mathematical proofs in Lean, particularly in the context of large language models. One prominent advocate of this approach is Terence Tao, who regularly writes about developments in this area.
From a programmer's perspective, this puts up an interesting parallel. Models such as Sonnet or Opus 4.5 can generate thousands of lines of code per hour. I can review the output, ask the model to write tests, iterate on the result, and after several cycles become confident that the software is sufficiently correct.
For centuries, mathematicians developed proofs by hand, using pen and paper, and were able to check the proofs of their peers. In the context of LLMs, however, a new problem may arise. Consider an LLM that constructs a proof in Lean 4 iteratively over several weeks, resulting in more than 1,000,000 lines of Lean 4 code and concluding with a QED. At what point is an mathematician no longer able to confirm with confidence that the proof is correct?
Such a mathematician might rely on another LLM to review the proof, and that system might also report that it is correct. We may reach a stage where humans can no longer feasibly verify every proof produced by LLMs due to their length and complexity. Instead, we rely on the Lean compiler, which confirm formal correctness, and we are effectively required to trust the toolchain rather than our own direct understanding.
This looks really good and complete. And I like that it starts from normal programming stuff rather than immediately jumping into type universes and proof expressions.
Edit: some feedback on the monads page:
1. "The bind operation (written >>=) is the heart of the monad." .. but then it doesn't use >>= at all in the code example.
2. "The left arrow ← desugars to bind; the semicolon sequences operations." .. again, no semicolons in the code example.
3. I don't really understand the List Monad section. The example doesn't seem to use any monad interfaces.
4. "The same laws look cleaner in the Kleisli category, where we compose monadic functions directly. If f:A→MB and g:B→MC, their Kleisli composition is g∘f:A→MC" - it was going so well!
The motivation itself is quite fresh and compelling even as a standalone article.
https://sdiehl.github.io/zero-to-qed/02_why.html