Can you elaborate on the complexity here for syscall entry on x86_64? (Or link to what you were reading?) Another commenter linked to Linux's own "nolibc", which is similar to, though simpler than, the Google project in the OP. Its x86_64 arch support is here, and it looks simple enough, just putting things into registers: https://github.com/torvalds/linux/blob/master/tools/include/...
The non-arch-specific callers which use this are here, which also look relatively straightforward: https://github.com/torvalds/linux/blob/master/tools/include/...
I don't see any complex stack alignment or anything which reads to me like it would require "niche C compiler options", so I'm curious if I'm missing something?
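For concreteness, here is roughly what I mean by "putting things into registers". This is my own minimal sketch for x86_64 Linux with GCC or Clang, not code copied from nolibc; `raw_syscall3` and `raw_write` are names I made up for illustration. The syscall number goes in rax, the first three arguments in rdi, rsi and rdx, and the `syscall` instruction itself overwrites rcx and r11, so those are listed as clobbers.

```c
#include <sys/syscall.h>  /* SYS_write, SYS_getpid syscall numbers */

/* Hypothetical helper (not nolibc's actual macro): issue a 3-argument
 * Linux syscall on x86_64. Returns the raw kernel result, which is
 * -errno on failure rather than -1 with errno set. */
static long raw_syscall3(long nr, long a1, long a2, long a3)
{
    long ret;
    __asm__ volatile (
        "syscall"
        : "=a"(ret)                           /* result comes back in rax */
        : "a"(nr),                            /* syscall number in rax */
          "D"(a1), "S"(a2), "d"(a3)           /* args in rdi, rsi, rdx */
        : "rcx", "r11", "memory"              /* clobbered by syscall */
    );
    return ret;
}

/* Hypothetical wrapper: write(2) without going through libc. */
static long raw_write(int fd, const void *buf, unsigned long len)
{
    return raw_syscall3(SYS_write, (long)fd, (long)buf, (long)len);
}
```

Calling `raw_write(1, "hi\n", 3)` from `main` writes to stdout. Nothing here needs special stack alignment or unusual compiler flags, as far as I can tell, which is why I'm asking.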
You linked the same file twice. Was that intentional?