The way to understand Arthur Whitney's C code is to first learn APL (or, more appropriately, one of his languages in the family). If you skip that part, it'll just look like a weirdo C convention, when really he's trying to write C as if it were APL. The most obvious of the typographic stylings--the lack of spaces, single-character names, and functions on a single line--are how he writes APL too. This is perhaps like being a Pascal programmer coming to C and indignantly starting with "#define begin {" and so forth, except that atw is not a mere mortal like us.
IMO this is a really good blog post, whatever you think of the coding style. Great effort by the author, really good for eight hours' work (as mentioned), and some illuminating conclusions: https://needleful.net/blog/2024/01/arthur_whitney.html#:~:te...
I was curious about Shakti after reading this and the comments, so followed the link to shakti.com on Wikipedia. It seems it now redirects to the k.nyc domain, which displays a single letter 'k'.
I wondered if I was missing something, so looked at the source, to find the following:
<div style='font-family:monospace'>k
Nothing but that. Which is, surely, the HTML equivalent of the Whitney C style: relying on the compiler/interpreter to add anything implicit, and shaving off every element that isn't required, such as a closing tag (which, yes, only matters if you're going to want something else afterwards, I guess...). Bravo.

Reminds me of Bourne's attempt at beating C into Algol: https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh...

Example: https://www.tuhs.org/cgi-bin/utree.pl?file=V7/usr/src/cmd/sh...
There are best or accepted practices in every field.
And in every field they work well for the average case, but are rarely the best fit for a specific scenario. And in some rare scenarios, doing the opposite is the solution that best fits the individual/team/project.
The interesting takeaway here is that crowd wisdom should be given weight, and should probably be the default if we want to turn off our brains. But if you turn on your brain you will unavoidably see the many cracks those solutions leave for your specific problem.
Kudos on not just taking a combative stance on the code!
This was a very fun read that I'm fairly convinced I will have to come back to.
This is a good use of macros. I understand people are frightened by how it looks, but it’s just C in a terse, declarative style. It’s mostly straightforward, just dense, and yes, it will challenge you because of the various obscure macro styles used.
I believe “oo” is probably an infinity error condition or some such, though I'm not 100% sure. I didn’t see the author discuss it, since they said it’s not used. It was probably used during development as a debug printout.
Much as a Real Programmer can write FORTRAN programs in any language, Whitney can write APL programs in any language.
```
#define _(e...) ({e;})
#define x(a,e...) _(s x=a;e)
#define $(a,b) if(a)b;else
#define i(n,e) {int $n=n;int i=0;for(;i<$n;++i){e;}}
```
>These are all pretty straight forward, with one subtle caveat I only realized from the annotated code. They're all macros to make common operations more compact: wrapping an expression in a block, defining a variable x and using it, conditional statements, and running an expression n times.
This is war crime territory
TIL `a ?: b`, that's actually pretty nice, a bit like Haskell's `fromMaybe b a` (or `a <|> b` if b can also be "empty")
and I do like `#define _(e...) ({e;})` – that's one where I feel the short macro name is OK. But I'd like it better if that were just how C worked from the get-go.
Very nice discussion at the end of the article. There are good things to be learnt from this code and its discussions even if you disagree with some or even most of the style.
All great industrial apps are DSLs for specific domains, because oftentimes end users are much smarter & craftier than developers. Some great examples:
- AutoCad (vector drawing DSL on top of Lisp)
- Mathematica (symbolic algebra DSL - Lisp & C)
- Aspen One (Thermodynamics/Chemistry DSL on FORTRAN)
- COMSOL (Multiphysics DSL - C++)
- Verilog (FPGA design DSL - C)
and also general purpose tools like Regex, XLA, CERN/Root, SQL, HTML/CSS,...
Is this supposed to be a specific coding style or paradigm?
I’ve never seen code written like this in real-world projects, except maybe for things like the "business card ray tracer". When I checked out Arthur Whitney’s Wikipedia page I noticed he also made the J programming language (which is open source), and the code there has that same super-dense style: https://github.com/jsoftware/jsource/blob/master/jsrc/j.c
I can’t explain why but “He’s assigning 128 to a string called Q” made me absolutely lose it.
The macros are fine as a concept; I've used something similar before to reduce code size, e.g. when defining hundreds of similar functions. What is incomprehensible, and puts the entire thing into "Obfuscated C" territory, is the one-letter variables. You'll need to memorize all of them, and you can't reuse them in normal code. If at least the variables were self-descriptive I'd support such a coding style, but as it stands the code clearly needs comments.
“would you rather spend 10 days reading 100,000 lines of code, or 4 days reading 1000?"
More like 10 days understanding 100K loc or 30 days stabbing yourself in the eye over 4K loc
The C preprocessor allows you to define a limited DSL on top of C. This is... sometimes a good thing, and often convenient, even if it makes it hard to understand.
Nice. Previous attempts by other users to decode Whitney's style of C programming can be found here - https://news.ycombinator.com/item?id=38889148
The stated reason Whitney does this is - https://news.ycombinator.com/item?id=32202742
Nice write up!
When I see stuff like this, personally, I don't try to understand it, as code like this emerges from basically three motivations:
- The other person wanted to write in some other more (functional|object oriented|stack) language but couldn't, so they did this.
- The person couldn't be bothered to learn idioms for the target language and didn't care about others being able to read the program.
- The person intentionally wanted to obfuscate the program.
And none of these are good reasons to write code in a particular way. Code is about communication. Code like this is the equivalent to saying "I know the grammatical convention in English is subject-verb-object but I feel like speaking verb-object-subject and other people will have to just deal with it"—which, obviously, is a horrible way to communicate if you actually want to share ideas/get your point across.
That all said, the desire to have logic expressed more compactly and declaratively definitely resonates. Unfortunately C style verbosity and impurity remains dominant.
There's the Java version too:
https://github.com/KxSystems/javakdb/blob/8a263abee29de582cd...
It’s cool that you can do this in C! And it’s cool that this article explores that.
As developers we have to decide where and when this makes sense, just like with other language features, libraries, architectural patterns, etc.
This reminds me of when I was learning perl.
At first, I thought it looked like line noise. $var on the left of the = sign? Constructs like $_ and @_? More obscure constructs were worse.
But I had to keep going and then one day something happened. It was like one of those 3d stereograms where your eyes have to cross or uncross. The line noise became idioms and I just started becoming fluent in perl.
I liked some of it too - stuff like "unless foo" being a more readable/human way of saying if not foo.
perl became beautiful to me - it was the language I thought in, and at the highest level. I could take an idea in my mind and express it in perl.
But I had some limits. I would restrain myself from putting entire loops or nested expressions on one line just to "save space".
I used regular expressions, but sometimes would match multiple times instead of all in one giant unreadable "efficient" expression.
and then, I looked at other people's perl. GAH! I guess other people can "express themselves in perl", but rarely was it beautiful or kind; it was statistically worse, closer to vomit.
I like python now. more sanity, (somewhat) more likely that different people will solve a problem with similar and/or readable code.
by the way, very powerful article (even if I intensely dislike the code)
The person who wrote this code might be a genius, but learning to read it isn’t going to make anyone smart. It’s basically obfuscated assembly code.
For the APL fans (or haters) Unicomp makes keycaps with APL symbols for their (excellent) Model M mechanical keyboards.
Reminds me of a Python codebase I used to work with
The company was originally a bunch of Access/VB6 programmers.
Then they wrote their VB code in PHP.
And then they wrote their PHP code in Python. It was disgusting.
> His languages take significantly after APL, which was a very popular language for similar applications before the invention of (qwerty) keyboards.
Ok, so this article is tongue in cheek. Good to know that up front.
> "Opinions on his coding style are divided, though general consensus seems to be that it's incomprehensible."
I wholeheartedly concur with popular opinion. It's like writing a program in obfuscated code.
Hmmm... his way of basically making C work like APL made me wonder: Is there a programming language out there that defines its own syntax in some sort of header and then uses that syntax for the actual code?
Ah yes... very tempting to ask an AI to refactor some large Java program (pick your language) "in the style of Arthur Whitney".
This man casually codes up IOCCC entries.
The code registers a bit like FORTH in concept.
Holy moly, this must be the equivalent of reading the Necronomicon and getting cosmic madness disease as a result.
What a flex of patience!
During code reviews I would always ask for clear code because it's much harder to tell whether it's correct if it's unclear.
I got too much other stuff to do than decode the voynich manuscript...
As a very long time C programmer: don't try to be smart. The more you rely on fancy preprocessor tricks the harder it will be to understand and debug your code.
The C preprocessor gives you enough power to shoot yourself in the foot, repeatedly, with anything from small caliber handguns to nuclear weapons. You may well end up losing control over your project entirely.
One nice example: glusterfs. There are a couple of macros in use there that, when they work, are magic. But when they don't, you lose days, sometimes weeks. This is not the way to solve coding problems; you only appear smart as long as you remember what you've built. Your other self, three years down the road, is going to want to kill the present one, and the same goes for your colleagues a few weeks from now.
Kernighan’s law seems to apply:
Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?
The same article is available under “I read Arthur Whitney's code and all I got was Mental Illness”, which is apt.
This parades all the reasons why you may want to avoid C like the plague, and then some. This stuff gives me nightmares.
This code style is psychotic. I had to reverse-engineer and verify a C codebase that was machine-obfuscated, and it was still clearer to follow than this. Increasing clarity through naming is great, but balancing information density is, dare I say, also a desirable goal. Compacting code hits rapidly diminishing returns once you're relying on the language's whitespace being insignificant.
HN stories about Whitney's code tend to predictably attract a lot of comments about the coding style, so I thought I'd share a couple of positive discussions from previous related posts.
Here's one from one of my favourite HN commenters posted at https://news.ycombinator.com/item?id=25902615#25903452 (Jan 2021):
"Its density is many times higher than most C programs, but that's no big obstacle to understanding if you don't attempt to "skim" it; you need to read it character-by-character from top to bottom. It starts off defining some basic types, C for Character and I for Integer, and then the all-important Array. This is followed by some more shorthand for printf, return, and functions of one and two arguments, all of the array type. The DO macro is used to make iteration more concise. Then the function definitions begin. ma allocates an array of n integers (this code is hardcoded for a 32-bit system), mv is basically memcpy, tr (Total Rank?) is used to compute the total number of elements, and ga (Get/Generate Array) allocates an array. This is followed by the definitions of all the primitive operations (interestingly, find is empty), a few more globals, and then the main evaluator body. Lastly, main contains the REPL. While I don't think this style is suitable for most programmers, it's unfortunate that the industry seems to have gone towards the other extreme." -- userbinator
Here's another from the same commenter on a different story at https://news.ycombinator.com/item?id=39026551#39038364 (Jan 2024):
"There's something very satisfying about how this style seems to "climb the abstraction ladder" very quickly, but all of those abstractions he creates are not wasted and immediately put to use. I think much of the amazement and beauty is that there isn't much code at all, and yet it does so much. It's the complete opposite of the bloated, lazy, lowest-common-denominator trend that's been spreading in many other languages's communities." -- userbinator
Another from the story at https://news.ycombinator.com/item?id=40544283#40544491 (Jun 2024):
"For people not accustomed to the style of Whitney, you can read various HN threads from the past to learn more about why he writes programs the way he does. It's deliberate and powerful." -- hakanderyal
One more from the same story at https://news.ycombinator.com/item?id=40544283#40545004 (Jun 2024):
"Whitney is famous for writing code like this, it's been his coding style for decades. For example, he wrote an early J interpreter this way in 1989. There's also a buddy allocator he wrote at Morgan Stanley that's only about 10 lines of C code." -- papercrane
This style is inherently worse because there are no spaces. My brain has been wired since 4 years old to read words, not letters. Words are separated by spaces. Havingnospacesbetweenwordsmakesthemexponentiallyhardertoreadandcomprehend.
Obfuscation is usually just a lack of accountability, and naive job security through avoiding peer-review.
Practically speaking, if people can't understand you, then why are you even on the team? Some problems can't be solved alone even if you live to 116 years old.
Also, folks could start dropping code in single instruction obfuscated C for the lols =3
the obsession with code elegance vs shipping velocity is telling here. Whitney's style works for him because he's building tools he'll maintain himself for decades. same product, same developer, same context.
most startups are in the opposite situation. you need three different engineers to understand what you built last quarter because two people quit and one went to a different team. your clever abstractions become technical debt when the person who made them isn't around to explain them.
here's the real question: are you optimizing for the code or the business? sometimes boring, verbose, googleable patterns beat clever compression because your constraint isn't keystrokes - it's hiring, onboarding, and velocity when half your team is new. that's startup reality.
[dead]
[dead]
I don't think writing code like that will make the average programmer team any faster. Unless you are really deep into the code and have a good mental model of how the symbols are structured, I think it's going to take longer, with the constant need to refer back and re-work out what a symbol means. I'd rather have the descriptive variable names. What he writes looks akin to minified JS to me.
They're all macros to make common operations more compact
I read the J Incunabulum before encountering this, and the point that stands out is that you don't start by jumping into the middle of it like many programmers who are familiar with C will do; the macros defined at the beginning will confuse you otherwise. They also build upon previous ones, so the code ends up climbing the "abstraction ladder" very quickly. I personally like the Iterate macro (i), for how it compresses a relatively verbose loop into a single character; and of course in an array language, the iteration is entirely implicit.
In other words, I believe the reason this code is hard to read for many who are used to more "normal" C styles is because of its density; in just a few dozen lines, it creates many abstractions and uses them immediately, something which would otherwise be many many pages long in a more normal style. Thus if you try to "skim read" it, you are taking in a significantly higher amount of complexity than usual. It needs to be read one character at a time.
As someone who has spent considerable time working with huge codebases composed of hundreds of tiny files with barely any substance to them, where finding out where things happen becomes an exercise in search, this extreme compactness feels very refreshing.