27us roundtrip is not really state of the art for zero-copy IPC; about 1us would be. What is causing this overhead?
It may or may not be good, depending on a number of factors.
I did read the original Linux zero-copy papers from Google, for example, and at the time (when using TCP) the juice was worth the squeeze when the payload was larger than 10 kilobytes (or 20? I don't remember right now and I'm on mobile).
Also, a common technique is batching, so you amortise the fixed per-call overhead (this is what sendmmsg/recvmmsg were introduced for) over, say, 10 payloads.
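For anyone who hasn't used it, a minimal sketch of that batching pattern with sendmmsg (Linux-specific; error handling omitted, and BATCH/PAYLOAD are illustrative numbers, not anything from the benchmark):

    #define _GNU_SOURCE
    #include <sys/socket.h>
    #include <sys/uio.h>

    #define BATCH 10      /* illustrative batch size */
    #define PAYLOAD 2048  /* illustrative payload size */

    /* Send BATCH payloads with one syscall instead of BATCH syscalls,
       amortising the fixed per-call overhead. */
    int send_batch(int fd, char bufs[BATCH][PAYLOAD]) {
        struct mmsghdr msgs[BATCH];
        struct iovec iovs[BATCH];
        for (int i = 0; i < BATCH; i++) {
            iovs[i].iov_base = bufs[i];
            iovs[i].iov_len = PAYLOAD;
            msgs[i].msg_hdr = (struct msghdr){ .msg_iov = &iovs[i],
                                               .msg_iovlen = 1 };
        }
        /* returns how many messages were actually queued */
        return sendmmsg(fd, msgs, BATCH, 0);
    }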
So yeah that number alone can mean a lot or it can mean very little.
In my experience, people who are doing low-latency stuff have already built their own thing around MSG_ZEROCOPY, io_uring and the like :)
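For reference, the MSG_ZEROCOPY opt-in looks roughly like this (minimal sketch, assuming Linux >= 4.14; real code also has to reap completion notifications from the socket error queue via recvmsg with MSG_ERRQUEUE before reusing the buffer):

    #include <sys/socket.h>

    /* Sketch of the kernel MSG_ZEROCOPY path. The kernel pins the
       pages instead of copying them, so buf must stay untouched until
       a completion arrives on the socket's error queue. */
    ssize_t send_zerocopy(int fd, const void *buf, size_t len) {
        int one = 1;
        if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one)) < 0)
            return -1;
        return send(fd, buf, len, MSG_ZEROCOPY);
    }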
It's not local IPC exactly: the roundtrip benchmark stat is for a TCP server-client ping/pong call with a 2 KB payload, though the TCP connection runs over local loopback (127.0.0.1). The measurement loop is roughly the sketch below.
Source: https://github.com/mvp-express/myra-transport/blob/main/benc...
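Not the repo's actual harness, just the shape of the measurement: one full send plus one fully received echo per iteration, mean reported in microseconds (timing details are illustrative):

    #include <time.h>
    #include <sys/socket.h>

    #define PAYLOAD 2048

    /* Client side of the ping/pong roundtrip measurement. */
    double mean_roundtrip_us(int fd, int iters) {
        char buf[PAYLOAD] = {0};
        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        for (int i = 0; i < iters; i++) {
            send(fd, buf, PAYLOAD, 0);
            size_t got = 0;              /* TCP is a byte stream, so  */
            while (got < PAYLOAD) {      /* loop until the whole echo */
                ssize_t n = recv(fd, buf + got, PAYLOAD - got, 0);
                if (n <= 0) return -1.0;
                got += (size_t)n;
            }
        }
        clock_gettime(CLOCK_MONOTONIC, &t1);
        return ((t1.tv_sec - t0.tv_sec) * 1e6
                + (t1.tv_nsec - t0.tv_nsec) / 1e3) / iters;
    }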
indeed, you can get a packet from one box to another in 1-2us
Asking for those who, like me, haven't yet taken the time to find technical information on that webpage:
What exactly does that roundtrip latency number measure (especially your 1us)? Does zero-copy imply mapping pages between processes? Is there an async kernel component involved (as I would infer from "io_uring"), or just two user-space processes mapping pages?