What even is "literate programming"? (2024)

58 points • by joecobb • last Tuesday at 7:56 PM • 33 comments

Comments

This essay seems to be missing the primary references for literate programming:

https://www.cs.tufts.edu/~nr/cs257/archive/literate-programm...

https://www-cs-faculty.stanford.edu/~knuth/lp.html

Knuth's intention seems clear enough in his own writing:

> Literate programming is a methodology that combines a programming language with a documentation language, thereby making programs more robust, more portable, more easily maintained, and arguably more fun to write than programs that are written only in a high-level language. The main idea is to treat a program as a piece of literature, addressed to human beings rather than to a computer.

and

> Let us change our traditional attitude to the construction of programs: Instead of imagining that our main task is to instruct a computer what to do, let us concentrate rather on explaining to human beings what we want a computer to do.

dandersch • today at 10:52 AM

Couple things that helped me understand literate programming:

- A literate program has code and documentation interleaved in one file.

- Weaving means extracting documentation and turning it into e.g. a pdf.

- Tangling means extracting code in a form that is understandable to a compiler.

A crucial thing to actually make this paradigm useful is the ability to change around the order of your code snippets, i.e. not letting the compiler dictate order. This enables you to code top-down/bottom-up however you see fit, like the article mentioned. My guess on why people soured on literate programming is that their first introduction involved using tools that didn't have this ability (e.g. Jupyter notebooks). Also, you usually lose a lot of IDE features: no go-to-definition, bad auto-complete, etc.
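A minimal sketch of the tangling step described above, assuming noweb-style `<<name>>` references. The chunk names and contents are made up for illustration, and real tools handle far more (multiple output files, several languages, escaping). The point is only that chunks can be written in any order and are assembled into compiler order afterwards.

```python
import re

# Hypothetical chunk store: name -> body, in the order the author chose to
# explain things (top-level structure first, details later).
CHUNKS = {
    "main": "<<imports>>\n\ndef main():\n    <<body>>\n",
    "body": "print(greeting())",
    "imports": "from greetings import greeting",
}

# A reference is a line that contains only <<name>>, possibly indented.
REF = re.compile(r"^(?P<indent>[ \t]*)<<(?P<name>[^<>]+)>>[ \t]*$")


def tangle(name, chunks, _stack=()):
    """Expand chunk `name`, recursively replacing <<other>> reference lines."""
    if name in _stack:
        raise ValueError("circular reference: " + " -> ".join(_stack + (name,)))
    out = []
    for line in chunks[name].splitlines():
        m = REF.match(line)
        if m is None:
            out.append(line)
            continue
        # Re-indent the expanded chunk to match the reference's indentation.
        expanded = tangle(m["name"], chunks, _stack + (name,))
        out.extend(m["indent"] + l if l else l for l in expanded.splitlines())
    return "\n".join(out)
```

Calling `tangle("main", CHUNKS)` yields the import line first, then `main()` with its body indented in place, regardless of the order in which the chunks were defined.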

IMO, the best tool that qualifies for proper literate programming is probably org-mode with org-babel. It's programming language agnostic, and it supports syntax highlighting and noweb references for changing around order. Of course it requires getting into the Emacs ecosystem, so it's destined to stay obscure.

forgotpwd16 • today at 1:37 PM

An interesting project I stumbled upon recently is AirLoom[0], essentially a reverse literate programming tool. Rather than having code and prose interwoven (either Knuth-style code-within-prose or doc-style/as-comments prose-within-code), you have them split into segment-annotated code and prose that references those segments. AirLoom can then produce a combined document with the references replaced by the actual code segments. This allows using a normal programming environment (not possible with the first approach) while staying order-independent (not possible with the second approach).

[0]: https://github.com/eudoxia0/airloom
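A rough sketch of the general idea (code segments lifted from ordinary source files, then spliced into prose that references them). The marker syntax here (`# seg:begin`, `# seg:end`, `@include(...)`) is invented for this example and is not AirLoom's own; see the project for its real format.

```python
import re

# Hypothetical marker syntax, made up for this sketch.
BEGIN = re.compile(r"^\s*# seg:begin (\w+)\s*$")
END = re.compile(r"^\s*# seg:end (\w+)\s*$")
INCLUDE = re.compile(r"^@include\((\w+)\)\s*$")


def lift(source):
    """Collect named segments from annotated source code."""
    segments, open_names = {}, []
    for line in source.splitlines():
        if (m := BEGIN.match(line)):
            open_names.append(m[1])
            segments[m[1]] = []
        elif (m := END.match(line)):
            open_names.remove(m[1])
        else:
            # A line belongs to every segment that is currently open.
            for name in open_names:
                segments[name].append(line)
    return {name: "\n".join(lines) for name, lines in segments.items()}


def weave(prose, segments):
    """Replace each @include(name) line in the prose with the named segment."""
    out = []
    for line in prose.splitlines():
        m = INCLUDE.match(line)
        out.append(segments[m[1]] if m else line)
    return "\n".join(out)


SOURCE = """\
# seg:begin area
def area(r):
    return 3.14159 * r * r
# seg:end area
# seg:begin main
print(area(2))
# seg:end main
"""

PROSE = """\
We print the result first:
@include(main)
The helper it relies on is:
@include(area)
"""
```

The source file stays a normal, runnable Python file (the markers are just comments), while `weave(PROSE, lift(SOURCE))` presents the segments in whatever order the prose wants.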

tehologist • today at 6:20 PM

Literate programming is alive and well in 2025.

https://leo-editor.github.io/leo-editor/

https://kaleguy.github.io/leovue/#/t/2

https://ganelson.github.io/inweb/inweb/index.html

Inform 7 is arguably one of the largest programs ever written in literate style.

Jtsummers • today at 6:26 PM

Perhaps the most prominent example of literate programming missed by the author: https://www.pbrt.org/ Physically Based Rendering by Pharr, Jakob, and Humphreys.

Responding directly to a couple things the author wrote:

> When programming, it’s not uncommon to write a function that’s “good enough for now”, and revise it later. This is impossible to adequately do in literate programming.

It's not impossible in literate programming. There's nothing about LP that impedes this; I do it all the time. I have a quick obvious implementation (perhaps a naive recursive solution) and throw it in to get things working. I revisit it later when I need to make that naive recursive one faster (memoization, DP, or just another algorithm altogether). It's no harder than what I'd do with an ordinary approach to programming.

> Unit testing is not supported one bit in WEB, but you can cobble something together in CWEB.

WEB was designed for use with Pascal and CWEB for C and C++. At the time the tools were developed, "unit testing" as the term is understood today was not really a widespread thing. Use other tools if you find that WEB is impeding your use of unit tests in your Pascal programs. With other tools (org-mode and org-babel are what I use), it's easy to do. Like with writing good enough functions, you just do it, and it's done. You write a unit test in a block of code and when it gets tangled you execute your unit tests. This can be more cumbersome in some languages than with others, but in Python it's as easy as:

  #+BEGIN_SRC python :noweb yes :tangle test/test_foo.py
    from hypothesis import ...
    from pytest import ...
    <<name_of_specific_test>>
    <<name_of_other_test>>
  #+END_SRC

  #+NAME: name_of_specific_test
  #+BEGIN_SRC python
    def test_frob(...):
        ...
  #+END_SRC

When I used LP regularly I had a little script I wrote that would tangle source from my org files, and because I had the names and paths specified everything would end up in the right place. This is followed by running `pytest` (or whatever test utility) as normal. I used this in makefiles and other scripts. This is only slightly harder than the normal approach, but not hard. I added a `tangle` step into my build and test process and it was good to go.
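A sketch of what such a tangle-then-test step could look like as a Python script (not the script described above, which isn't shown). It assumes `emacs` is on the PATH, that the org file path contains no double quotes or backslashes, and that tests are tangled into a `test` directory; `org-babel-tangle-file` is the function Org provides for tangling a file non-interactively.

```python
import subprocess
import sys


def tangle_command(org_file):
    """Build an Emacs batch command that tangles one org file."""
    # Assumption: org_file needs no escaping inside the elisp string literal.
    elisp = f'(org-babel-tangle-file "{org_file}")'
    return ["emacs", "--batch", "--eval", "(require 'ob-tangle)", "--eval", elisp]


def test_command(test_dir="test"):
    """Build the pytest invocation; test_dir is an assumed location."""
    return [sys.executable, "-m", "pytest", test_dir]


def build_and_test(org_files, test_dir="test"):
    """Tangle every org file, then run the tests and return pytest's exit code."""
    for org_file in org_files:
        # check=True aborts the build as soon as one tangle step fails.
        subprocess.run(tangle_command(org_file), check=True)
    return subprocess.run(test_command(test_dir)).returncode
```

From a makefile or CI job this would be called as something like `build_and_test(["foo.org"])`, with the return value used as the process exit status.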

If your unit test system requires more ceremony then you'll need to include that as well, but you'd have to include that in your conventionally written code as well.

WillAdams • today at 5:17 PM

There are a lot of texts which were left out of this post --- I've been trying to collect literate programs published as books here:

https://www.goodreads.com/review/list/21394355-william-adams...

Not sure where the author got the contention that there are only a few tools for literate programming --- it's a straight-forward enough task that many programmers do this --- heck, even I managed to (w/ a bit of help on tex.stackexchange): https://github.com/WillAdams/gcodepreview/blob/main/literati... --- if it were more complex, and wasn't so implementation-specific (filenames need to be specified in multiple places), I'd write it up as a Literate Program and put it up on CTAN as a package.

One classic bit of advice for writing is, ‘It is perfectly okay to write garbage as long as you edit brilliantly.’ --- the great thing about a Literate Program is that it makes the act of editing far simpler, which has made feasible every program I've ever written which got past the 1K-line mark --- including an AppleScript for InDesign which Olav Martin Kvern, then the "Scripting Evangelist" for Adobe Systems, declared to be impossible (my boss had promised a system for creating a four-level deep index from XML embedded in the text of pages in an InDesign document, while OMK averred that it was impossible to create an index entry for more than the main level of the index --- one has to have code which tracks the existence of an entry at each level of the index and, where it does not exist, starting at the top level, inserts it, then works down and adds the sub-index-entry to the index-entry it is beneath).
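The index logic in that parenthetical can be sketched with plain nested dictionaries. This is only a model of the bookkeeping in Python, not the AppleScript or the InDesign scripting API; the function name and the sample terms are made up.

```python
def add_index_entry(index, path):
    """Insert a multi-level entry, creating missing parent levels top-down.

    `index` is a nested dict of term -> sub-entries; `path` is a tuple of
    terms from the top level down. Returns the list of levels that had to
    be created, each as a tuple prefix of `path`.
    """
    created = []
    level = index
    for depth, term in enumerate(path):
        if term not in level:
            # This level does not exist yet: insert it before going deeper.
            level[term] = {}
            created.append(path[: depth + 1])
        level = level[term]
    return created
```

Adding `("Fonts", "OpenType", "Features", "Ligatures")` to an empty index creates all four levels; adding `("Fonts", "OpenType", "Features", "Kerning")` afterwards creates only the last one, because the three parent levels are already tracked.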

stingraycharles • today at 9:17 AM

To me the definition of literate programming is much less interesting than the spirit: for complicated logic / parts of code, I try to take the reader through the whole top-down plan / approach, as if it’s a story I’m writing to my colleagues about what’s going on and why. In those parts of code I can easily have 10 times as many lines of comments as code, but it’s important to use it sparingly: people tend to start to ignore comments if they’re low value. But it’s much more effective to have good comments than external documentation, as external documentation has a tendency to go out of sync with the code.

As with most things, don’t be dogmatic.

rgreeko42 • today at 3:52 PM

Aren't Org Mode and your LISP of choice the ideal literate programming environment? I'm surprised REPL-based LISP isn't mentioned at all.

nerdypepper • today at 2:49 PM

relatedly, i have been using literate haskell to document my advent of code journey this year:

- day 5's solution for example: https://aoc.oppi.li/2.3-day-5.html#day-5

- literate haskell source: https://tangled.org/oppi.li/aoc/blob/main/src/2025/05.lhs

the book/site is "weaved" with pandoc, the code is "tangled" with a custom markdown "unlit" program that is passed to GHC.
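For readers unfamiliar with the "unlit" step: a preprocessor strips the prose and hands only the code to the compiler. A minimal sketch of that idea in Python, assuming code lives in fenced blocks tagged `haskell` (the actual custom program linked above may work differently):

```python
FENCE = "`" * 3  # a Markdown code fence, built here to avoid a literal one


def unlit(markdown, lang="haskell"):
    """Keep lines inside fenced `lang` blocks; blank out everything else.

    Prose and fence lines become empty lines instead of being dropped, so
    line numbers in compiler errors still match the literate source.
    """
    out, in_code = [], False
    for line in markdown.splitlines():
        stripped = line.strip()
        if not in_code and stripped == FENCE + lang:
            in_code = True
            out.append("")
        elif in_code and stripped == FENCE:
            in_code = False
            out.append("")
        else:
            out.append(line if in_code else "")
    return "\n".join(out)
```

GHC can be pointed at an external literate preprocessor with its `-pgmL` option, which is presumably how a custom unlit like the one described gets wired in.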

svilen_dobrev • today at 12:09 PM

well, maybe it is everything that is not "illiterate programming", i.e. "programming-without-understanding".. which decade by decade gets more and more abundant/dominating.

i do a similar thing which i call live-sketching.. a mostly-no-content python namespace-hierarchy of module(s) and classes (used as just namespace holders), and then add (would-do-something) "terminal" methods, and combine-those-into-flows actual "procedures" methods, here and there .. until the "communication" diagram starts to appear out of it, and week after week, fill the missing parts. It feels like some way of writing an executable spec over imagined/fake stuff, and slowly replacing the fakes with reals. Some parts never get filled. Others are replaced with big-external-pieces - as long as they match the spec needed. What's left is written by hand.. and all this maybe in multiple cycles.

This approach allows for both keeping the knowledge of what the system should do - on the spec / hierarchical level - and freedom to leave things undone, plug some external monster, or do-it-yourself as one sees fit. The downside is that the plumbing between pieces might be bigger/messier than the pieces - if you have ever seen the spiderweb of wires above a breadboard with TTL ICs..

e.g. for my last project - re-engineering multiple aging variants of a kiosk-system into a coherent single codebase that can spawn each/most of the previous - it took me 6 months to turn a zoo of 20x 25KLoc into a single 20KLoc +- 5 for the specializations - and the code-structure still preserves the initial split-of-concerns (some call it architecture), and comms "diagram", who talks to who when/why.

But yeah, it's not for the faint-hearted, and there is little visibility of the amount of work going/done, as the structure at day 1 is more or less the structure at day 181, and management may decide to see only that..

zupatol • today at 5:46 PM

Another successful example of literate programming is fastHTML, and probably most of the code written at fast.ai and answer.ai. https://fastht.ml/docs/

Here's Jeremy Howard explaining why he loves doing everything in notebooks: https://www.youtube.com/watch?v=9Q6sLbz37gk

machino • today at 12:49 PM

I’ve inherited some CWEB code from a colleague. My interpretation is that you write it like stream of consciousness, interleaving thinking and chunks of code. Not all code you write ends up in the final C file.

However, the final effect is spaghetti code (you can emulate “goto” by injecting code in different locations). And docs are hard to read.

But, it really forces you to explain what you do and how you got there, which is incredibly useful for reconstructing history. (There is also a sort of diff file for it, I think with .ch extension, to amend files.)

loa_in_ • today at 9:19 AM

The examples are definitely worth acknowledging.

I imagine the biggest hurdle on the path towards adopting this is writing down clear, readable prose using highly technical language. And naming things. Using ambiguous human language to describe a complex algorithm without causing a conflict in a big team.

zkmon • today at 3:40 PM

You are beating around a bush of nothing.

js8 • today at 11:24 AM

Maybe I am weird, but I would like to see/program in a formal, yet fuzzy/modal language, which could serve as a metalanguage that describes (documents) the program. This metalanguage must have some kind of constructs to describe unknown things, or things that are deliberately simplified in favor of exposition. So basically eschew natural language completely in favor of fully formalized description, that could be manipulated programmatically.

However, I don't know what this metalanguage should be. I don't know how to translate typical comments (or a literate program) into some sort of formal language. I think we have a gap in philosophy (epistemology).

npodbielski • today at 11:12 AM

Maybe this will be an unpopular opinion, but if your idea still has to be explained in a blog post after 50 years, maybe it was not that good after all. Or maybe the idea was good, but the state of the tools and the culture of your field make it hard to implement. As the blog post asks: what tool would you use for literate programming? Or do you need to write a tool for literate programming first? To me it sounds a bit like a runnable Python notebook, which is great for DevOps stuff but not really for developing a financial system. And I do not want to get started on the lack of tests the author describes.

antiquark • today at 2:20 PM

I seriously looked into it many years ago...

One problem with "literate programming" is it assumes that good coders are also good writers, and the good writers are also good coders.

Another problem is that the source files for the production code will have to be "touched" for documentation changes. Which IMHO is an absolute no-no for production code. Once the code has been validated, no more edits! If you want to edit docs, go ahead, just don't edit the actual source.
