logoalt Hacker News

The Coming Need for Formal Specification

56 pointsby todsacerdotilast Saturday at 3:17 AM43 commentsview on HN

Comments

Animatslast Saturday at 4:56 AM

The trouble with formal specification, from someone who used to do it, is that only for some problems is the specification simpler than the code.

Some problems are straightforward to specify. A file system is a good example. The details of blocks and allocation and optimization of I/O are hidden from the API. The formal spec for a file system can be written in terms of huge arrays of bytes. The file system is an implementation to store arrays on external devices. We can say concisely what "correct operation" means for a file system.

This gets harder as the external interface exposes more functionality. Now you have to somehow write down what all that does. If the interface is too big, a formal spec will not help.

Now, sometimes you just want a negative specification - X must never happen. That's somewhat easier. You start with subscript checking and arithmetic overflow, and go up from there.

That said, most of the approaches people are doing seem too hard for the wrong reasons. The proofs are separate from the code. The notations are often different. There's not enough automation. And, worst of all, the people who do this stuff are way into formalism.

If you do this right, you can get over 90% of proofs with a SAT solver, and the theorems you have to write for the hard cases are often reusable.

show 7 replies
auggieroselast Saturday at 12:00 PM

I predict in 10 years „code“ that is not actually in the language of a proof assistant will be the exception, not the norm. This will happen by proof assistant languages becoming easier to the point that normal code will look cumbersome

victor_xuanlast Saturday at 10:52 AM

Formal specification only works once the system are in a relatively final shape.

No body wants to pay that price when they are struggling with product market fit.

show 1 reply
abrgrlast Saturday at 6:29 PM

Great article!

> There exists a higher level problem of holistic system behavior verification

This is the key observation. Strict, code-level verification of every line, while valuable in the small, doesn't contribute very meaningfully to understanding or building confidence in higher level system properties.

I see a future of modules with well defined interfaces and associated contracts. Module level contracts can be formally verified or confidently believed via property based testing. Higher level system behavior can then be proven or tested by assuming the module contracts are upheld.

Module boundaries allow for varying the desired level of confidence module by module (testing, property based testing with variable run time allotments, formal verification).

LLMs can handle each step of that chain today, allowing humans to invest specifically where we want more confidence.

Mikhail_Edoshinlast Saturday at 8:21 AM

Programming is very similar to inductive reasoning, you establish some facts, do a step that changes something, and establish facts after the step and then repeat that with different steps until you get the facts you are after. (See e.g. Manber's book.) If you merely set out to rigorously write the computation steps, the result will already be a proof sufficient for a human. I am not sure, but I think Dijkstra ended up doing or maybe just writing about something like that, he wanted to get programs that were correct by construction, not proven to be correct afterwards.

What is utterly lacking is a formal notation for these computation-reasoning steps that is not tied to a specific programming language. As a result what we get now is an informal specifications and a bunch of different final implementations for different languages. Whatever knowledge could be kept is not kept in any organized way.

show 2 replies
readthenotes1last Saturday at 4:32 AM

Some realized that building the tests was the more important part of writing the software long ago...

Test tables and scenarios may not cover all the things that can go wrong (a la Goëdel), but not doing them almost guarantees broken software.

show 1 reply
rramadasslast Saturday at 6:12 AM

Nada Amin's research mentioned in another thread is relevant here - https://news.ycombinator.com/item?id=46252034

show 1 reply
jackblemminglast Saturday at 4:55 AM

Whenever you look closely at what these proof nerds have actually built you typically find… nothing. No offense to them, it’s simply reality.

show 3 replies