Why isn’t it possible — or is it — to make libc just use uring instead of syscall?
Yes I know uring is an async interface, but it’s trivial to implement sync behavior on top of a single chain of async send-wait pairs, like doing a simple single-threaded “conversational” implementation of a network protocol.
It wouldn’t make a difference in most individual cases but overall I wonder how big a global speed boost you’d get by removing a ton of syscalls?
Or am I failing to understand something about the performance nuances here?
Not speaking of ls which is more about metadata operations, but general file read/write workloads:
io_uring requires API changes to get the full benefit, because you don't call it like the old read(please_fill_this_buffer). In the provided-buffer style, you maintain a pool of buffers that belong to the ring, and reads take buffers from the pool. You consume the data from a buffer and then return it to the pool.
With the older style, you're required to maintain O(pending_reads) buffers. With the io_uring style, you have a pool of O(num_reads_completing_at_once) buffers (I assume with backpressure but haven't actually checked).
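To make the buffer-count difference concrete, here is a toy Python model of the two ownership styles. It is not the io_uring API: BufferPool, take, give_back, and the two helper functions are names made up for illustration, and "backpressure" is modeled simply as take() returning None when the pool is empty.

```python
from collections import deque


class BufferPool:
    """Toy stand-in for a ring-owned buffer pool (hypothetical, not io_uring)."""

    def __init__(self, count, size):
        self._free = deque(bytearray(size) for _ in range(count))

    def take(self):
        # A completing read grabs a free buffer; None models backpressure.
        return self._free.popleft() if self._free else None

    def give_back(self, buf):
        # The application returns the buffer once it has consumed the data.
        self._free.append(buf)

    def free_count(self):
        return len(self._free)


def old_style_buffers(pending_reads, size):
    # read(fd, buf, n) style: the caller supplies one buffer per
    # outstanding read up front, so memory is O(pending_reads).
    return [bytearray(size) for _ in range(pending_reads)]


def pooled_buffers_needed(pending_reads, completing_at_once, size):
    # Pool style: only reads that are completing right now hold a buffer.
    if completing_at_once < 1:
        raise ValueError("pool needs at least one buffer")
    pool = BufferPool(completing_at_once, size)
    peak_in_use = 0
    remaining = pending_reads
    while remaining:
        batch = []
        while remaining and (buf := pool.take()) is not None:
            batch.append(buf)
            remaining -= 1
        peak_in_use = max(peak_in_use, len(batch))
        for buf in batch:  # consume the data, then return the buffer
            pool.give_back(buf)
    return peak_in_use


if __name__ == "__main__":
    print(len(old_style_buffers(1000, 4096)))    # one buffer per pending read
    print(pooled_buffers_needed(1000, 8, 4096))  # peak buffers in use at once
```

With 1000 pending reads and 8 completing at a time, the first function allocates 1000 buffers while the pooled version never has more than 8 in use.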
In addition to sibling's concern about syscall amplification, the async interface isn't useful to the application (from a latency perspective) if you just serialize a bunch of sync requests through it.
In order to make this work, libc would have to:
- Start some sort of async executor thread to service the io_uring requests/responses
- Make every call to a "normal" syscall put the calling thread to sleep until the result is available (that's 1 syscall)
- When the executor thread gets a result, have it wake up the original thread (that's another syscall)
So you're basically turning 1 syscall into 2 in order to emulate the legacy syscalls.
io_uring only makes sense if you're already async. Emulating sync on top of async is nearly always a terrible idea.
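The three steps above can be sketched as a toy Python model. This is only an illustration of the pattern, not how any libc works: a queue.Queue stands in for the ring, a threading.Event stands in for the sleep/wake pair, and SyncOverAsync, call, sleeps, and wakes are made-up names. The counters tally the sleep and wake steps that would each cost a syscall in the real thing.

```python
import queue
import threading


class SyncOverAsync:
    """Toy model of emulating blocking calls on top of an async executor."""

    def __init__(self):
        self._submissions = queue.Queue()  # stand-in for the submission ring
        self._lock = threading.Lock()
        self.sleeps = 0  # times a caller blocked waiting for its result
        self.wakes = 0   # times the executor woke a caller
        self._executor = threading.Thread(target=self._run, daemon=True)
        self._executor.start()

    def _run(self):
        # Executor thread: take a request, "complete" it, wake the caller.
        # No error handling: a raising func would leave its caller blocked.
        while True:
            item = self._submissions.get()
            if item is None:
                return
            func, args, slot, done = item
            slot.append(func(*args))
            with self._lock:
                self.wakes += 1
            done.set()  # wake the original thread (the second syscall)

    def call(self, func, *args):
        # Emulated "normal" syscall: submit, then sleep until woken.
        slot, done = [], threading.Event()
        self._submissions.put((func, args, slot, done))
        with self._lock:
            self.sleeps += 1
        done.wait()  # calling thread sleeps (the first syscall)
        return slot[0]

    def close(self):
        self._submissions.put(None)
        self._executor.join()


if __name__ == "__main__":
    emu = SyncOverAsync()
    print(emu.call(len, b"hello"))
    print(emu.sleeps, emu.wakes)  # one sleep and one wake per emulated call
    emu.close()
```

Every emulated call pays for one sleep and one wake, which is the 1-into-2 amplification described above, and the caller still waits for the full round trip.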