Maybe I'm too stupid to understand the article... How does this achieve performant querying for olap and oltp purposes?
Based on my understanding, olap queries will go to the parquet files which are stored in a columnar fashion and oltp style queries will go to a caching layer that sits on top of those parquet files?
What's the special sauce here? Seems like they're just caching the data which, for all intents and purposes, seems like the same solution of storing another copy of the data which is what they say they're avoiding.
Hi, I work on Lakebase (but not on storage), here's how I understand it.
For Lakebase and Neon, our architecture needs the caching layer regardless (what we call Pageservers). Performing reads from S3 directly is too slow so we reconstruct pages and keep them on an nvme server for faster querying. Changing the format on S3 to be Parquet effectively introduces no additional copies over our existing architecture