I just realized that one could probably write a userspace io_uring emulator as a library: it would spawn one thread to read the ring buffer and a pool of worker threads to perform the blocking operations. The only change needed in the main software would be to call your library instead of the io_uring syscalls; the app logic could remain the same.
Then software that wants to use io_uring wouldn't need to write its low-level I/O code twice.
I'm planning to start something like this over the weekend, targeting epoll, poll, dispatch_io, and maybe kqueue.
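
To make the reader-thread-plus-worker-pool idea concrete, here is a minimal sketch in Python (chosen only for brevity; a real emulator would presumably be written in C and mirror the actual io_uring structures). Everything in it is hypothetical: the `EmulatedRing` class, its `submit`/`wait_cqe` methods, and the `"read"`/`"write"` op names are invented for illustration and are not part of liburing or any existing library. Plain thread-safe queues stand in for the shared ring buffers.

```python
import errno
import os
import queue
import threading
from concurrent.futures import ThreadPoolExecutor


class EmulatedRing:
    """Toy userspace stand-in for an io_uring instance (hypothetical API)."""

    _STOP = object()  # sentinel that tells the dispatcher thread to exit

    def __init__(self, workers=4):
        self._sq = queue.Queue()  # submission queue: (op, args, user_data)
        self._cq = queue.Queue()  # completion queue: (user_data, result)
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._dispatcher = threading.Thread(target=self._drain_sq, daemon=True)
        self._dispatcher.start()

    def submit(self, op, args, user_data):
        """Queue one request; returns immediately, like filling an SQE."""
        self._sq.put((op, args, user_data))

    def wait_cqe(self, timeout=None):
        """Block until one completion is available and return it."""
        return self._cq.get(timeout=timeout)

    def close(self):
        self._sq.put(self._STOP)
        self._dispatcher.join()
        self._pool.shutdown(wait=True)

    def _drain_sq(self):
        # Dispatcher thread: pull submissions and hand them to the pool.
        while True:
            sqe = self._sq.get()
            if sqe is self._STOP:
                return
            self._pool.submit(self._run, *sqe)

    def _run(self, op, args, user_data):
        # Worker thread: perform the blocking call, then post a completion.
        try:
            if op == "read":
                fd, nbytes = args
                result = os.read(fd, nbytes)
            elif op == "write":
                fd, data = args
                result = os.write(fd, data)
            else:
                result = -errno.EINVAL
        except OSError as exc:
            result = -exc.errno  # negative errno, in the style of a CQE result
        self._cq.put((user_data, result))


def demo():
    r, w = os.pipe()
    ring = EmulatedRing(workers=2)
    ring.submit("read", (r, 5), user_data=1)  # blocks in a worker until data arrives
    ring.submit("write", (w, b"hello"), user_data=2)
    completions = dict(ring.wait_cqe(timeout=5) for _ in range(2))
    ring.close()
    os.close(r)
    os.close(w)
    return completions


if __name__ == "__main__":
    print(demo())
```

The caller only ever touches `submit` and `wait_cqe`, so in principle the same application code could sit on top of a real io_uring backend or this thread-pool fallback. Note that a blocked read occupies a whole worker thread here, which is exactly the cost an epoll-, poll-, or kqueue-based backend would try to avoid for sockets and pipes.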