A bit more color on what I found interesting building this: The motivating question for me was les...

russellthehippo • yesterday at 7:08 PM • 1 reply • view on HN

A bit more color on what I found interesting building this:

The motivating question for me was less “can SQLite read over the network?” and more “what assumptions break once the storage layer is object storage instead of a filesystem?”

The biggest conceptual shift was around *layout*.

What felt most wrong in naive designs was that SQLite page numbers are not laid out in a way that matches how you want to fetch data remotely. If an index is scattered across many unrelated page ranges, then “prefetch nearby pages” is kind of a fake optimization. Nearby in the file is not the same thing as relevant to the query.

That pushed me toward B-tree-aware grouping. Once the storage layer starts understanding which table or index a page belongs to, a lot of other things get cleaner: more targeted prefetch, better scan behavior, less random fetching, and much saner request economics.

Another thing that became much more important than I expected is that *different page types matter a lot*. Interior B-tree pages are tiny in footprint but disproportionately important, because basically every query traverses them. That changed how I thought about the system: much less as “a database file” and much more as “different classes of pages with very different value on the critical path.”

The query-plan-aware “frontrun” part came from the same instinct. Reactive prefetch is fine, but SQLite often already knows a lot about what it is about to touch. If the storage layer can see enough of that early, it can start warming the right structures before the first miss fully cascades. That’s still pretty experimental, but it was one of the more fun parts of the project.

A few things I learned building this:

1. *Cold point reads and small joins seem more plausible than I expected.* Not local-disk fast, obviously, but plausible for the “many mostly-cold DBs” niche.

2. *The real enemy is request count more than raw bytes.* Once I leaned harder into grouping and prefetch by tree, the design got much more coherent.

3. *Scans are still where reality bites.* They got much less bad, but they are still the place where remote object storage most clearly reminds you that it is not a local SSD.

4. *The storage backend is super important.* Different storage backends (S3, S3 Express, Tigris) have verg different round trip latencies and it's the single most important thing in determining how to tune prefetching.

Anyway, happy to talk about the architecture, the benchmark setup, what broke, or why I chose this shape instead of raw-file range GETs / replication-first approaches / etc.

Replies

hgo • yesterday at 8:06 PM

I really appreciate this post. Freely and humbly sharing real insights from an interesting project. I almost feel like I got a significant chunk of the reward for your investment into this project just by reading.

Thank you for sharing.

➕ show 1 reply

alt Hacker News

Replies