This could use a bit more "why".
Shortcomings of Parquet are mentioned as being overcome by this, but which ones? Certainly not wide tool support...
Why should one leave Parquet or ORC for this structure?
The ‘why’ is referenced in the bibliography at the end of the readme; this repo is not meant to be consumed standalone. Start with the paper instead:
Yeah, it seems like most of this could be handled by putting some more dev hours into Parquet.
The paper mentions Parquet, ORC, Nimble, Lance, TSFile, Bullion, and BtrBlocks.
I also had no idea what they were talking about, but there are good points about how Parquet is hardware-oblivious and handles metadata in a somewhat global way.
I found this post interesting:
- https://medium.com/@reliabledataengineering/f3-the-future-pr...