Inability to scale with sustained usage (1+ person*year of data) is the fatal problem with this cate...

dustingetz • 10/02/2024 • 4 replies • view on HN

Inability to scale with sustained usage (1+ person*year of data) is the fatal problem with this category in existing approaches. Root of this is primarily the “partial sync problem” - when dataset outgrows both the memory and compute resources available in the client device (which is unreliable, not under your control to make reliable, and resource constrained - and not everybody has the latest giga device), you have to somehow decide which working set to replicate. If the app structure is a graph and involves relational queries, this is undecidable without actually running the queries! If the app structure is constrained to topic documents, you still have to choose which topics to cache and it still falls over at document sizes easily seen in prolonged solo use let alone use at an enterprise which market is necessary to justify VC investment. All this in an increasingly connected world where the cloud is 10ms away from city/enterprise users (the $$$) and any offline periods (subway etc) resolve quickly, so a degraded offline mode (letting you edit whatever you were last looking at w/o global relational query etc) is often acceptable. Oh, and does your app need AI?

Change my view!

Replies

jakelazaroff • 10/02/2024

Author here! A few thoughts:

1. That "in existing approaches" qualifier is important — local-first is still very much a nascent paradigm, and there are still a lot of common features that we don't really know how to implement yet (such as permissioning). You might be correct for the moment, but watch this space!

2. I think most apps that would benefit from a local-first architecture do not have the monotonically growing dataset you're describing here. Think word processors, image editors, etc.

3. That said, there are some apps that do have that problem, and local-first probably just isn't the right architecture for them! There are plenty of apps for which client-server is a fundamentally better architecture, and that's okay.

4. People love sorting things into binaries, but it doesn't have to be zero-sum. You can build local-first features (or, if you prefer, offline-first features) into a client-server app, and vice versa. For example, the core experience of Twitter is fundamentally client-server, but the bookmarking feature would benefit from being local-first so people can use it even when they're offline.

➕ show 1 reply

matlin • 10/02/2024

A lot of the newer local first systems, like Triplit (biased because I work on it), support partial replication so only the requested/queried data is sent and subscribed to on the client.

The other issue of relying on a just the server to build these highly collaborative apps is you can't wait for a roundtrip to the server for each interaction if you want it to feel fast. Sure you can for those rare cases where your user is on a stable Wifi network, on a fast connection, and near their data; however, a lot computing is now on mobile where pings are much much higher than 10ms and on top of that when you have two people collaborating from different regions, someone will be too far away to rely on round trips.

Ultimately, you're going to need a client caching component (at least optimistic mutations) so you can either lean into that or try to create a complicated mess where all of your logic needs to be duplicated in the frontend and backend (potentially in different programming languages!).

The best approach IMO (again biased to what Triplit does) is to have the same query engine on both client and server so the exact query you use to get data from your central server can also be used to resolve from the clients cache.

vardump • 10/02/2024

Just an idea: perhaps all of the end devices should have at least some high reliability storage. This would enable local applications that require high data durability and integrity.

Probably it'd require ECC RAM to prevent in memory bitrot, multiple copies of blocks (or even multiple physical block devices) with strong checksums.

Perhaps this data should somehow "automagically" sync between all locally available devices, again protected with strong checksums at every step.

(This idea requires some refining.)

➕ show 2 replies

ForHackernews • 10/02/2024

Extremely curious what kinds of data your app would produce such that it outgrows the memory available on a single client device. I have about 2 person-decades' worth of writing, art, and coding that occupies less than 10 GB (and could probably be made smaller).

➕ show 1 reply

alt Hacker News

Replies