PKM apps are trees of strings! It's fast until it's not. Even if you can sync the global dataset to device storage, the query engine still needs that data in process memory and has to traverse it with device-level compute, not cloud compute, and do it "instantly", i.e. without making the UI feel sluggish. If you feel otherwise, use Roam or Tana for a year, even in single-player mode. The entire category is bottlenecked on this scale problem. And now add team support, because you want to sell this to teams and make money, right? Designing for casual, personal-sized datasets is a viable architecture in very few apps. Google Maps is one shining counterpoint, because the content has a natural locality to it – you only need to sync content near where you are geographically!
> And now add team support, because you want to sell this to teams and make money, right?
I mean, I don't, personally. I'm writing a couple small apps to scratch my own itches and I might sell them to anyone else who wants an individual copy for personal use.
Remember when you could just buy a copy of a program and use it on your own computer? And it would never get updated to remove functionality or break because some servers were shut down? That's the experience I'm seeking from local-first software.
I think designing for casual, personal-sized data is extremely easy if you give up the idea that every program needs to be some bloated Enterprise-Ready junkware.
I think you’re misunderstanding the overall architecture here. Instead of syncing the whole tree of strings as one unit, the way you would generally represent a PKM with Yjs is to make each logical document its own Yjs document (especially given the assumption that offline periods are short).
You could still build a server-side search index over those documents, which never needs to be sent to the client.
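To make the server-side index idea concrete, here's a rough sketch of a toy inverted index keyed by document id. Everything in it is made up for illustration: `SearchIndex`, `update`, and `search` are hypothetical names, and the plain strings stand in for whatever text the server extracts from each Yjs document. It doesn't touch the Yjs API at all.

```python
import re
from collections import defaultdict


class SearchIndex:
    """Toy server-side inverted index: token -> set of document ids.

    Hypothetical sketch; a real deployment would more likely use an
    existing search engine than a hand-rolled index like this one.
    """

    def __init__(self):
        self._postings = defaultdict(set)
        self._tokens_by_doc = {}

    @staticmethod
    def _tokenize(text):
        # Lowercase and split on anything that is not a letter or digit.
        return set(re.findall(r"[a-z0-9]+", text.lower()))

    def update(self, doc_id, text):
        """Re-index one document after the server receives a change to it."""
        new_tokens = self._tokenize(text)
        old_tokens = self._tokens_by_doc.get(doc_id, set())
        # Only touch the postings that actually changed for this document.
        for token in old_tokens - new_tokens:
            self._postings[token].discard(doc_id)
        for token in new_tokens - old_tokens:
            self._postings[token].add(doc_id)
        self._tokens_by_doc[doc_id] = new_tokens

    def search(self, query):
        """Return the ids of documents containing every token in the query."""
        tokens = self._tokenize(query)
        if not tokens:
            return set()
        result = None
        for token in tokens:
            ids = self._postings.get(token, set())
            result = set(ids) if result is None else result & ids
        return result


index = SearchIndex()
index.update("note-1", "Local-first software and CRDTs")
index.update("note-2", "Grocery list: eggs, milk")
# A later edit to note-1 only re-indexes that one document.
index.update("note-1", "Local-first software and sync engines")
matches = index.search("local software")
```

The point is that the client only ever sees the matching document ids (and then syncs those documents); the postings themselves stay on the server, and each edit re-indexes one document rather than the whole tree.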