logoalt Hacker News

SoKamiltoday at 4:04 PM1 replyview on HN

FoundationDB is awesome testing wise as they have deterministic simulation testing [1] that can simulate distributed and operating system failures.

> We wanted FoundationDB to survive failures of machines, networks, disks, clocks, racks, data centers, file systems, etc., so we created a simulation framework closely tied to Flow. By replacing physical interfaces with shims, replacing the main epoll-based run loop with a time-based simulation, and running multiple logical processes as concurrent Flow Actors, Simulation is able to conduct a deterministic simulation of an entire FoundationDB cluster within a single-thread! Even better, we are able to execute this simulation in a deterministic way, enabling us to reproduce problems and add instrumentation ex post facto. This incredible capability enabled us to build FoundationDB exclusively in simulation for the first 18 months and ensure exceptional fault tolerance long before it sent its first real network packet. For a database with as strong a contract as the FoundationDB, testing is crucial, and over the years we have run the equivalent of a trillion CPU-hours of simulated stress testing.

[1]https://pierrezemb.fr/posts/notes-about-foundationdb/#simula...


Replies

gioazzitoday at 5:37 PM

And they went on to build Antithesis to deliver the same capabilities to other systems, pretty cool stuff!

[1]: https://antithesis.com/company/backstory/

show 1 reply