Note that all of this reflects design decisions on Bluesky's closed-source "AppView" server—any federated servers interacting with Bluesky would need to construct their own timelines, and do not get the benefit of the work described here.
This is not true. Third-party PDSes are fully supported by our AppView, and our AppView generates timelines for all the users on those PDSes.
What reason does Bluesky give for not opening up their AppView code?
Another notable component that is closed source is the discovery feed generator, where at least there is some reason to keep it closed.
My thinking has evolved on this topic significantly as of late. My current thinking is we should create a secure gossip network on top of the Bluesky API, and forget about all the DAG-CBOR stuff that gets stripped from the Jetstream. Hash the posts on the gossip layer and if posts change then diff them. This is all prep for when X billionaire buys out Bluesky then we just pop some signing key crypto on top of this gossip layer and wow! It's distributed!
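A rough sketch of the hash-and-diff part of that idea, in Python. Everything here is an assumption for illustration: `post_hash`, `GossipStore`, and the example URI are hypothetical names, posts are treated as plain JSON-like dicts, and there is no networking or signing. It only shows how a node could notice that a gossiped post changed and produce a diff.

```python
import difflib
import hashlib
import json


def post_hash(post: dict) -> str:
    """SHA-256 of a canonical JSON encoding (sorted keys, no extra whitespace)."""
    canonical = json.dumps(
        post, sort_keys=True, separators=(",", ":"), ensure_ascii=False
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class GossipStore:
    """Hypothetical in-memory store of posts seen on the gossip layer."""

    def __init__(self) -> None:
        self.posts: dict[str, dict] = {}   # uri -> last seen post
        self.hashes: dict[str, str] = {}   # uri -> hash of last seen post

    def receive(self, uri: str, post: dict) -> tuple[str, list[str]]:
        """Record a post; return ("new" | "unchanged" | "changed", diff lines)."""
        digest = post_hash(post)
        previous = self.posts.get(uri)
        if previous is None:
            status, diff = "new", []
        elif self.hashes[uri] == digest:
            # Same hash means same content, so there is nothing to diff.
            return "unchanged", []
        else:
            # Hash mismatch: diff the pretty-printed old and new versions.
            old_lines = json.dumps(previous, sort_keys=True, indent=2).splitlines()
            new_lines = json.dumps(post, sort_keys=True, indent=2).splitlines()
            diff = list(difflib.unified_diff(old_lines, new_lines, lineterm=""))
            status = "changed"
        self.posts[uri] = post
        self.hashes[uri] = digest
        return status, diff


if __name__ == "__main__":
    store = GossipStore()
    uri = "at://did:example:alice/app.bsky.feed.post/1"  # made-up URI
    print(store.receive(uri, {"text": "hello"})[0])
    status, diff = store.receive(uri, {"text": "hello world"})
    print(status)
    print("\n".join(diff))
```

Hashing a canonical JSON form keeps the comparison independent of key order, which matters if different nodes serialize the same post differently. Signing could later be layered on by signing the digest rather than the raw post.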
As others have noted, the appview is open source. The dataplane has two implementations, one in postgres and another in scylla. The scylla dataplane is closed, the postgres one is open.
The interesting next stage for the postgres implementation is to create a sync engine for partial syncs of the network, so that an appview can run affordably. We ran some benchmarks on the current state of the postgres implementation and found we could index 300k users on a $100/mo VPS. I think with a couple of weeks of optimization that could reach 1M users.