That framing doesn't make sense. System calls and their arguments are an obvious security bound...

tptacek • 10/12/2024 • 2 replies • view on HN

That framing doesn't make sense. System calls and their arguments are an obvious security boundary and have been a sandboxing component for decades. io_uring blows that boundary apart. The "problem" is io_uring, not seccomp.

If you want to make a case for io_uring being benign for security, the right argument is probably against all unmediated shared-kernel multitenancy (ie: multitenancy either through virtualization, or WASM/V8-type language runtimes, and nothing else). It doesn't make sense to say system call filters are flawed because someone came up with an omni-syscall that breaks those filters.

Replies

asveikau • 10/12/2024

The syscall implementations themselves do checks and return EPERM/EACCES when appropriate. The mechanism for doing the syscall can change. I mean, in the 90s it happened via int 0x80, then we got sysenter, then the vdso. io_uring just moved part of it to user mode.

It seems like a totally reasonable design to me to "just" put the right hooks into the filter mechanism and make it get called the same way regardless of the syscall mechanism.

thayne • 10/12/2024

The obvious solution is to block operations over io_uring if the equivalent syscall would have been blocked by seccomp. But I'm not sure if there is some reason that wouldn't work.

Another possibility would be to allow setting restrictions on all io_uring operations for the current and all child processes, although that would be less convenient than using the existing seccomp system.

➕ show 1 reply

alt Hacker News

Replies