First, thanks for sharing this link, it was an interesting read! A few remarks below.
I had a hard time reading the wc code in the article. First I had to go to GitHub to learn that "da" stands for dynamic array, and then to realize that what the author calls wc is not at all the Linux wc command, which by default gives you the number of lines, words, and bytes in a file, not the number of occurrences of each word in the file, which is what the proposed code computes.
Also, since I had to read the GitHub README, another remark: it says that sp_io uses pthreads rather than fork and exec. Both of those approaches (but especially pthreads) contradict the explicit goal of programming against the lowest-level interfaces. I believe the lowest-level syscall is clone3 [1], which gives you more fine-grained control over what is shared between the parent and child processes, allowing you to implement either fork or threads.
[1] https://manpages.debian.org/trixie/manpages-dev/clone3.2.en....
I agree with most of the criticisms they make.
I agree that pointer and length is better than null-terminated strings (although it is difficult in C, and as they mention you will have to use a macro (or some additional functions) to make this work in C).
Implementing the C standard library directly against syscalls is also a good idea. In some cases an implementation might need to avoid this for some reason, but generally it is better for the standard library to sit directly on syscalls.
The FILE object is sometimes useful, especially if you have functions such as fopencookie and open_memstream; but it might be useful (although probably not with C) to be able to optimize parts of a program that only use a single implementation of the FILE interface (or a subset of its functions, e.g. one that does not use seeking).
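To make the "macro or some additional functions" part concrete, here is a minimal sketch of a pointer-and-length string in C99. The names str_view, STR_LIT, str_from_cstr, str_eq and str_slice are made up for illustration; they are not taken from sp.h.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical type: a non-owning view of len bytes, no terminator required. */
typedef struct {
    const char *data;
    size_t len;
} str_view;

/* Build a view from a string literal. sizeof counts the trailing NUL, so
   subtract 1. The "" s concatenation rejects non-literal arguments. */
#define STR_LIT(s) ((str_view){ "" s, sizeof(s) - 1 })

/* Build a view from an ordinary C string (this is where strlen is paid for). */
static str_view str_from_cstr(const char *s) {
    str_view v = { s, strlen(s) };
    return v;
}

/* Byte-wise equality; embedded NUL bytes are compared like any other byte. */
static int str_eq(str_view a, str_view b) {
    return a.len == b.len && (a.len == 0 || memcmp(a.data, b.data, a.len) == 0);
}

/* Slicing is O(1): no copy and no terminator to write. Bounds are clamped. */
static str_view str_slice(str_view v, size_t start, size_t end) {
    if (end > v.len) end = v.len;
    if (start > end) start = end;
    str_view out = { v.data + start, end - start };
    return out;
}
```

For example, str_slice(STR_LIT("hello world"), 6, 11) compares equal to str_from_cstr("world") without touching the original buffer.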
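As an illustration of why the FILE abstraction can be handy, here is a small sketch using POSIX open_memstream, which backs a FILE with a growing heap buffer. The helper name format_greeting is hypothetical, and this assumes a libc that provides open_memstream (glibc and musl do).

```c
#define _POSIX_C_SOURCE 200809L /* exposes open_memstream */
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats text through the ordinary FILE API into memory.
   Returns a malloc'd, NUL-terminated buffer (caller frees) or NULL on failure. */
static char *format_greeting(const char *name, size_t *out_len) {
    char *buf = NULL;
    size_t len = 0;
    FILE *f = open_memstream(&buf, &len);
    if (!f) return NULL;
    fprintf(f, "hello, %s!", name);
    /* buf and len are only guaranteed up to date after fflush or fclose. */
    if (fclose(f) != 0) {
        free(buf);
        return NULL;
    }
    if (out_len) *out_len = len;
    return buf;
}
```

Any code that already writes to a FILE * works unchanged with this stream. The flip side is that every call goes through the generic stream machinery even though only one backend is in play.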
> Principles
> Be extremely portable
> sp.h is written in C99, and it compiles against any compiler and libc imaginable. It works on Linux, on Windows, on macOS. It works under a WASM host. It works in the browser. It works with MSVC, and MinGW, it works with or without libc, or with weird ones like Cosmopolitan. It works with the big compilers and it works with TCC.
> And, best of all, it does all of that because it’s small, not because it’s big.
vs
> Non-goals
> Obscure architectures and OSes
> I write code for x86_64 and aarch64. WASM is becoming more important, but is still secondary to native targets. I don’t care to bloat the library to support a tiny fraction of use cases.
> That being said, if you’re interested in using the library on an unsupported platform, I’m more than happy to help, and if we can make the patch reasonable, to merge it.
Those are contradictory. Either the code is extremely portable, or it can't support "obscure" platforms, but not both.
> Program directly against syscalls
Works nicely on Linux where the syscall interface is explicitly stable, but on many (most?) other platforms this is not the case.
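For what "by syscall number" looks like on Linux, a rough sketch follows. The names raw_getpid and raw_write are made up. Note that syscall() is itself a libc wrapper, so a truly libc-free program would need inline assembly instead, and the numbers are only a stable contract on Linux; on Windows and macOS the supported boundary is the system library.

```c
#define _GNU_SOURCE /* for syscall() on glibc */
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Linux-only sketch: ask the kernel for our PID by syscall number. */
static long raw_getpid(void) {
    return syscall(SYS_getpid);
}

/* write(2) by number; returns bytes written, or -1 with errno set. */
static long raw_write(int fd, const void *buf, size_t len) {
    return syscall(SYS_write, fd, buf, len);
}
```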
> There Is No Heap
I don't understand what this means, when it's followed by the definition of a heap allocation interface. The paragraph after the code block conveys no useful information.
> Null-terminated strings are the devil’s work
Agreed! I also find the stance regarding perf optimization agreeable.
My impression of the sample programs is that they're unreadably noisy, but maybe this would be a good compiler target if you're writing your own language?
This doesn't look good:
c8 buf [SP_PATH_MAX] = sp_zero;
sp_cstr_copy_to_n(path, len, buf, SP_PATH_MAX);
since #define SP_PATH_MAX 4096
There should be a fallback for very long paths.
It's a disadvantage that it's header-only. It needs to include <windows.h> and a bunch of other stuff, which slows down compilation. Splitting it into a couple of files (a header and an implementation) would be much better.
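On the SP_PATH_MAX point, one possible shape for a long-path fallback: use the stack buffer when the path fits and allocate otherwise. This is only a sketch; path_to_cstr is a hypothetical helper, not something the library provides, and I have not checked what sp_cstr_copy_to_n does when the input is too long.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a NUL-terminated copy of path[0..len).
   Uses the caller's stack buffer when the path fits, otherwise mallocs.
   *heap receives the pointer to free afterwards (NULL if no allocation). */
static char *path_to_cstr(const char *path, size_t len,
                          char *stack_buf, size_t stack_cap, char **heap) {
    char *dst = stack_buf;
    *heap = NULL;
    if (len >= stack_cap) {                 /* no room for len bytes plus NUL */
        if (len == SIZE_MAX) return NULL;   /* len + 1 would overflow */
        dst = malloc(len + 1);
        if (!dst) return NULL;
        *heap = dst;
    }
    memcpy(dst, path, len);
    dst[len] = '\0';
    return dst;
}
```

The caller declares the usual 4096-byte array, passes it in, and calls free(heap) unconditionally afterwards, since free(NULL) is a no-op.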
How does this library work in programs with parts still requiring libc?
How does it deal with code executing before main? Libc does a bunch of necessary stuff, like calling initializers for global variables.
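To show the kind of pre-main code in question, here is a tiny sketch using the GCC/Clang constructor attribute (a compiler extension, not standard C). The names g_ready, init_before_main and is_ready are illustrative.

```c
static int g_ready = 0;

/* GCC/Clang extension: registered in .init_array and normally invoked by the
   C runtime's startup code before main is entered. */
__attribute__((constructor))
static void init_before_main(void) {
    g_ready = 1;
}

/* Reports whether the constructor has run. */
static int is_ready(void) {
    return g_ready;
}
```

With an ordinary libc startup, is_ready() returns 1 by the time main runs. In a program linked with -nostdlib and a hand-written _start, nothing walks .init_array unless that startup code does it itself, which is exactly the concern.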
We should have left C in the 90's already, but then FOSS happened,
"Using a language other than C is like using a non-standard feature: it will cause trouble for users. Even if GCC supports the other language, users may find it inconvenient to have to install the compiler for that other language in order to build your program. So please write in C."
The GNU Coding Standard in 1994, http://web.mit.edu/gnu/doc/html/standards_7.html#SEC12
"The library’s stance, to put it simply, that the juice ain’t worth the squeeze when it comes to low level, compute-bound performance.
Designing software and data structures for performance against unknown use cases on unknown hardware is extremely difficult and the resulting code is much more complicated. Even then, it’s often better to use code written against your actual use case and hardware when performance is that critical.
Things that are off the table might be:
- SIMD
- A highly optimized hash table rewrite
- Figuring out where inlining or LIKELY causes the compiler to produce better code."
LOL...
Classic vibe coder.
DJB was saying similar things in the 1990s -- eg https://cr.yp.to/proto/netstrings.txt
Just taking a quick look at the atomics section:
First, (on Unix) it's wrapping a pthread mutex. That's part of libc! (Technically it might not be libc.so, but it's still the standard library.)
Also, none of the atomics say anything about the memory model. You don't _have_ to use the C11 memory model (Linux, for example, doesn't). But if you're not using the C11 memory model and letting the compiler insert fences for you, you definitely need to insert the fence instructions yourself.
While C11 atomics do rely on libgcc, so do the __sync* functions that this library uses (see https://godbolt.org/z/bW1f7xGas for an example).
Oops... apparently this is vibecoded. Welp, I just wasted ten minutes of my life reviewing slop that I'm not going to get back.
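For reference, this is roughly what "talking about the memory model" looks like with C11 atomics: a release store paired with an acquire load, so the compiler emits whatever fences the target needs. It's a minimal sketch with made-up names (payload, ready, publish, try_consume), not the library's API.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* plain data, published through the flag */
static atomic_bool ready;  /* static storage: starts out false */

/* Producer: write the data, then release-store the flag. The release
   ordering keeps the payload write from being reordered after the store. */
static void publish(int value) {
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer: acquire-load the flag. If it reads true, the payload write
   made before the matching release store is guaranteed to be visible. */
static bool try_consume(int *out) {
    if (!atomic_load_explicit(&ready, memory_order_acquire)) return false;
    *out = payload;
    return true;
}
```

The legacy __sync builtins are documented by GCC as full barriers in most cases, which is safe but leaves no way to ask for anything weaker.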
I do not want to include and compile a standard library for every file that includes it.
Why do standard library headers always have to be insane?
> Every language that depends on third party libraries, like js and python, is getting massively infected with supply chain worms
> Only couple of languages not affected are those that don't have a culture of downloading third party code, like C and C++
> Ex js and python developer publishes a 'library'
> Library is vibe coded
> Published on github amidst GitHub being hit by supply chain attacks, had their source code leaked.
The timing is terrible for starters, and I don't trust the vibe coded code at all. Imagine there's a pandemic, the cities are on fire, and you arrive in a rural town asking to kiss people.
Wonderful!
Yet another slop coded library.
What could possibly go wrong...
[1] https://wtf-8.codeberg.page/
Wait, is a compound literal an l-value in that sense (as opposed to just being able to take its address)?! *takes a look at the C99 standard* Oh my, it indeed is (C99 §6.5.2.5 p5). Good to know!
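A quick sketch of what that lvalue-ness allows in C99 and later (the function names are just for illustration):

```c
struct point { int x, y; };

/* Taking the address works: the literal is an unnamed object whose lifetime
   is the enclosing block. */
static int bump_literal(void) {
    int *p = &(int){41};
    *p += 1;
    return *p;
}

/* Direct assignment also works, because it is a modifiable lvalue. */
static int assign_to_literal(void) {
    return ((int){1} = 7);
}

/* Same with structs: mutate the unnamed object through a pointer. */
static int sum_point(void) {
    struct point *p = &(struct point){ .x = 1, .y = 2 };
    p->x += 10;
    return p->x + p->y;
}
```

Note this is C only; in C++ compound literals are a compiler extension with different lifetime rules.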