One nice thing about some modern formats is that there are tools that read them at extraordinarily high effective speed. For example, DuckDB can do all manner of nifty optimizations while reading its own native format or Parquet. And I’m not sure that those optimizations can be effectively applied to a format that needs a WASM blob to be run to understand it. By the time you run a non-SIMD or even a SIMD-optimized pass over app the data, if that pass doesn’t understand your query, you may have already lost.
I admit I only skimmed the beginning of the paper, and maybe the format is less general than it sounds.
My understanding is it’s a fallback mechanism