> I was expecting a unified interface across all architectures, with perhaps one or two architecture-specific syscalls to access architecture-specific capabilities; but Linux syscalls are more like Swiss cheese.
There's a lot of historical weirdness, mostly in places where the kernel went "oops, we need 64-bit time_t or off_t or whatever" and added, for example, getdents64 to old platforms, while new platforms never got the broken 32-bit version in the first place. There are some more interesting cases, though: until fairly recently (i.e. about a decade ago for the mainline kernel), 32-bit x86 (and maybe other platforms?) had no individual syscall for each socket operation; they were all multiplexed through socketcall.
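To make the socketcall point concrete, here is a rough sketch of what "open a socket without the libc wrapper" can look like. The helper name `raw_socket` and the constant `SOCKETCALL_OP_SOCKET` are made up for illustration; the value 1 is assumed to match `SYS_SOCKET` in `<linux/net.h>`. It still goes through libc's generic `syscall()`, so it only shows the numbering difference, not the register-level ABI.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Assumed to equal SYS_SOCKET from <linux/net.h>; name is hypothetical. */
#define SOCKETCALL_OP_SOCKET 1

/* Hypothetical helper: create a socket via a raw syscall number.
 * Returns the fd, or -1 with errno set (that is syscall()'s convention). */
static long raw_socket(int domain, int type, int protocol)
{
#if defined(SYS_socket)
    /* Platforms with a dedicated socket syscall (e.g. x86-64). */
    return syscall(SYS_socket, domain, type, protocol);
#elif defined(SYS_socketcall)
    /* Platforms that only expose the multiplexer: the operation code
     * goes first, then a pointer to the packed argument array. */
    unsigned long args[3] = {
        (unsigned long)domain,
        (unsigned long)type,
        (unsigned long)protocol,
    };
    return syscall(SYS_socketcall, SOCKETCALL_OP_SOCKET, args);
#else
#error "no known way to create a socket on this platform"
#endif
}
```

Calling `raw_socket(AF_UNIX, SOCK_STREAM, 0)` should hand back an ordinary file descriptor that you can `close()` as usual. Which branch gets compiled depends entirely on what the installed kernel headers define for the target architecture, which is sort of the Swiss cheese in miniature.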
> In an ideal world, there would be a header-only C library provided by the Linux kernel; we would include that file and be done with it. As it turns out, there is no such file, and interfacing with syscalls is complicated.
That's because Linux is the exception: on UNIX, the public API is the C library, as later defined by POSIX.
The whole point of creating C and rewriting UNIX V4 in C was to move away from this kind of platform detail.
Also, UNIX can in a way be seen as C's runtime, so traditionally the C compiler came from the platform vendor. There was no picking and choosing among C compilers and standard libraries; that was left to non-UNIX platforms.
> In an ideal world, there would be a header-only C library provided by the Linux kernel; we would include that file and be done with it. As it turns out, there is no such file, and interfacing with syscalls is complicated.
Isn't that nolibc.h?
I've been thinking about doing this for a little side project for some time. Looking forward to the eventual conclusion :)
There is an existing project that tracks and gathers syscalls in the Linux kernel for all ABIs: https://github.com/mebeim/systrack . The author maintains a table here, which is incredibly useful: https://syscalls.mebeim.net/?table=x86/64/x64/latest