AFAIK Cloudflare runs their whole stack on every machine. I guess that gives them flexibility and maybe better load balancing. They also seem to use only one NIC.
why is tcp/ip special? The kernel should expose a raw network device. ... Why run a complex network protocol stack in the kernel when you can just expose a configured layer 2 device to a user space process?
Check out the MIT Exokernel project and Solarflare OpenOnload that used this approach. It never really caught on because the old school way is good enough for almost everyone.
why stop there, why not TLS as well?
kTLS is a thing now (mostly used by Netflix). Back in the day we also had kernel-mode Web servers to save every cycle.
Was it Tux? I've only used it, a looong time ago, on load balancers.
https://en.wikipedia.org/wiki/TUX_web_server