This bit is quite genius. Rather than depending on a language-specific SDK/lib to work with the format, you can fall back to exported WASM methods if none exists:
> "Each self-describing F3 file includes both the data and meta-data, as well as WebAssembly (Wasm) binaries to decode the data. Embedding the decoders in each file requires minimal storage (kilobytes) and ensures compatibility on any platform in case native decoders are unavailable."

I don't understand how that's supposed to work. What does the decoder decode into? That's gonna depend entirely on the kind of data, right? For some formats, it's gonna be a stream of bytes; for others, a 2D plane of pixels; others again will need vertexes, 2D planes of pixels and UV maps; for some, an object graph will make more sense.
It sounds neat, but feels like it might fall apart with higher-complexity formats. What does an embedded decoder for a PDF look like? I guess since the decoders are tightly coupled to the file bytes themselves, the author of the file gets to choose what output formats make sense, but not all formats have a one-true-decode-step.
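For what it's worth, here is a minimal sketch of how I read the fallback idea in the quote above: the reader prefers a native decoder and only uses the one shipped in the file if it has none. Everything here is hypothetical. The names (`NATIVE_DECODERS`, `decode_column`, `rle_decode`) are made up for illustration, a plain Python function stands in for what would really be a Wasm module running in a sandboxed runtime, and the bytes-in/bytes-out interface is my assumption, not something the quote specifies.

```python
from typing import Callable, Dict

# Assumed interface: a decoder maps encoded bytes to decoded bytes.
Decoder = Callable[[bytes], bytes]

# Hypothetical registry of decoders compiled into the reader itself.
NATIVE_DECODERS: Dict[str, Decoder] = {
    "plain": lambda payload: payload,
}


def rle_decode(payload: bytes) -> bytes:
    """Stand-in for a decoder shipped inside the file.

    Input is a sequence of (count, value) byte pairs.
    """
    out = bytearray()
    for i in range(0, len(payload) - 1, 2):
        count, value = payload[i], payload[i + 1]
        out += bytes([value]) * count
    return bytes(out)


def decode_column(encoding: str, payload: bytes,
                  embedded: Dict[str, Decoder]) -> bytes:
    # Prefer a native decoder; fall back to the one embedded in the file.
    decoder = NATIVE_DECODERS.get(encoding) or embedded.get(encoding)
    if decoder is None:
        raise ValueError(f"no decoder for encoding {encoding!r}")
    return decoder(payload)


# The reader has no native "rle" decoder, so the embedded one is used.
embedded_in_file = {"rle": rle_decode}
result = decode_column("rle", bytes([3, 65, 2, 66]), embedded_in_file)
print(result)  # b'AAABB'
```

This only works because both decoders agree on one output shape. The sketch sidesteps the question of what that shape should be for pixels, meshes, or object graphs, which is exactly the open question here.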
Applets redux.
Is embedding executable code into a file a security risk? My assumption is yes.
Except you need FlatBuffers to access that blob.
I would call it clever. I'm not sure I'd call it genius.
When I'm working with data I'm working in a specific set of languages. Usually one. Yeah, other people might be working in other languages, but no individual author really needs a language-agnostic way of accessing data beyond compile time. Add to that the runtime boundaries that will likely need to be crossed, where an in-language decoder dealing with known offsets or tags could instead be inlined by the compiler (depends on the data format, of course). To the other commenter's point, am I going to have to sandbox all data access code just to be sure it's not able to do something unexpected? There's a lot of complexity here. And the inherent risk is going to slow down the operation that should be the simplest and fastest: interpreting bytes.
So attackers don't have to craft specially corrupted files? They can just include the code to perform the attack in the data file itself?