WASM has strong tried and proven sandboxing. We basically can build on nearly 30 years of experience. The decoders don't need a lot of access, they can basically be pure functions.
If this will pan out security-wise I don't know. I'm more worried that it will be so slow that no one will use it. Interesting idea, though, and I can see applications outside of the "big data" realm this apparently targets.
WASM implementations are fairly mature now, but if there was e.g. an image file format with embedded WASM that needed to execute before you could view it, it would become the new low-hanging-fruit target for 0-click RCEs - whether it's exploiting the WASM engine itself or some other attack surface that's influenceable via it (See also, the FORCEDENTRY JBIG2 exploit).
> The decoders don't need a lot of access, they can basically be pure functions
They don't currently either do they? It's the tight coupling of the interface layer no? I'm not sure this would be faster, or more secure so reliability might be the best usecase?
> WASM has strong tried and proven sandboxing. We basically can build on nearly 30 years of experience. The decoders don't need a lot of access, they can basically be pure functions.
I've heard that kind of sentiment many times before. It's not a good (thought-terminating) mindset to have for any secure software.
There are several WASM implementations, WASM is just a format. "Pure functions" are pure at a superficial level. Many people say that they don't mutate global state, but they do ... it's just hidden. The decoders "not needing a lot of access" doesn't matter if the WASM engine is pwned through arbitrary code execution inside the environment, or if it's contorted to bypass the access control you are mentioning through various side-effects.
How do you prevent compression bomb attacks when files can define their own compression functions?
You could have some kind of OOM killer, but that will be a "footgun" that people who are actually doing "big data" will constantly shoot.
This pretty much kills any ingestion pipeline where the source is untrusted.