logoalt Hacker News

mrkeenlast Monday at 9:19 PM1 replyview on HN

One of the perks of being distributed, I guess.

The kind of failure that a system can tolerate with strict fsync but can't tolerate with lazy fsync (i.e. the software 'confirms' a write to its caller but then crashes) is probably not the kind of failure you'd expect to encounter on a majority of your nodes all at the same time.


Replies

johncolanduonilast Monday at 10:23 PM

It is if they’re in the same physical datacenter. Usually the way this is done is to wait for at least M replicas to fsync, but only require the data to be in memory for the rest. It smooths out the tail latencies, which are quite high for SSDs.

show 2 replies