Hacker News

Jepsen: NATS 2.12.1

298 points by aphyr yesterday at 6:51 PM | 107 comments

Comments

stmw yesterday at 8:57 PM

Every time someone builds one of these things and skips over "overcomplicated theory", aphyr destroys them. At this point, I wonder if we could train an AI to look over a project's documentation, and predict whether it's likely to lose committed writes just based on the marketing / technical claims. We probably can.

rishabhaiover yesterday at 10:22 PM

NATS be trippin, no CAP.

johncolanduoni yesterday at 9:42 PM

Wow. I’ve used NATS for best-effort in-memory pub/sub, which it has been great for, including getting subtle scaling details right. I never touched their persistence and would have investigated more before I did, but I wouldn’t have expected it to be this bad. Vulnerability to simple single-bit file corruption is embarrassing.

vrnvu yesterday at 7:23 PM

Sort of related. Jepsen and Antithesis recently released a glossary of common terms which is a fantastic reference.

https://jepsen.io/blog/2025-10-20-distsys-glossary

merb yesterday at 7:34 PM

> 3.4 Lazy fsync by Default

Why? Why do some databases do that? To have better performance in benchmarks? It’s not like it’s OK to do that; have a better default, or at least write a lot about it. Especially when you run stuff in a small cluster, you get bitten by things like that.

mysfi yesterday at 11:30 PM

Curious about the differences between content on aphyr.com/tags/jepsen and jepsen.io/analyses. I recently discovered aphyr.com and was excited about the potential insights!

rdtsc yesterday at 8:16 PM

> By default, NATS only flushes data to disk every two minutes, but acknowledges operations immediately. This approach can lead to the loss of committed writes when several nodes experience a power failure, kernel crash, or hardware fault concurrently—or in rapid succession (#7564).

I am getting strong early MongoDB vibes. "Look how fast it is, it's web-scale!". Well, if you don't fsync, you'll go fast, but you'll go even faster piping customer data to /dev/null, too.

Coordinated failures shouldn't be a novelty or a surprise any longer these days.

I wouldn't trust a product that doesn't default to the safest options. It's fine to provide relaxed modes of consistency and durability, but just don't make them the default. Let users configure those themselves.
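
To make the difference concrete, here is a minimal sketch of the "safe" ordering: the write is acknowledged (the function returns) only after fsync has completed. This is not NATS code; `append_durably` is a hypothetical name, and the lazy variant described in the quote would return right after the write and leave the fsync to a timer.

```python
import os


def append_durably(path, record):
    """Append one record and return only once fsync has completed.

    Hypothetical helper for illustration. The caller treats the return
    value as the acknowledgement, so nothing is acked before it has been
    handed to stable storage.
    """
    with open(path, "ab") as f:
        f.write(record)
        f.flush()              # push Python's userspace buffer to the kernel
        os.fsync(f.fileno())   # ask the kernel to flush the file to the device
    # Caveat: for a newly created file, a fully durable design would also
    # fsync the parent directory; that step is omitted here for brevity.
    return len(record)
```

The cost is one fsync per acknowledged message, which is where the throughput penalty comes from.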

shikhar yesterday at 11:05 PM

If you are looking for a serverless alternative to JetStream, check out https://s2.dev

Pros: unlimited streams with the durability of object storage – JetStream can only do a few K topics

Cons: no consumer groups yet, it's on the agenda

maxmcd yesterday at 8:20 PM

> > You can force an fsync after each messsage [sic] with always, this will slow down the throughput to a few hundred msg/s.

Is the performance warning in the NATS docs something that could be improved on? Couldn't you still run fsync on an interval and queue up a certain number of writes to be flushed at once? I could imagine latency suffering, but batched throughput could be preserved to some extent?
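
The batching idea here is usually called group commit. A rough sketch, assuming a single-threaded writer and a made-up `GroupCommitLog` class (none of this is NATS's API): records are queued, written together, fsynced once, and only then acknowledged as a batch.

```python
import os


class GroupCommitLog:
    """Hypothetical append-only log that acknowledges writes per batch."""

    def __init__(self, path, max_batch=64):
        self._file = open(path, "ab")
        self._max_batch = max_batch
        self._pending = []       # (seq, record) pairs accepted but not yet durable
        self._next_seq = 0
        self.fsync_calls = 0     # exposed only to show the amortization

    def append(self, record):
        """Queue a record; return the sequence numbers made durable by this call."""
        self._pending.append((self._next_seq, record))
        self._next_seq += 1
        if len(self._pending) >= self._max_batch:
            return self.flush()
        return []                # nothing is acknowledged until a flush happens

    def flush(self):
        """Write every pending record, fsync once, then acknowledge them all."""
        if not self._pending:
            return []
        for _, record in self._pending:
            self._file.write(record)
        self._file.flush()
        os.fsync(self._file.fileno())
        self.fsync_calls += 1
        acked = [seq for seq, _ in self._pending]
        self._pending = []
        return acked

    def close(self):
        self.flush()
        self._file.close()
```

With `max_batch=100`, 250 appends followed by `close()` cost three fsync calls instead of 250. A real server would also call `flush()` from a short timer so a partial batch does not wait indefinitely, which is the latency trade-off mentioned above.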

dangoodmanUT yesterday at 11:49 PM

Half-expected tbh, but didn’t expect it to be this bad.

Just use redpanda.

clemlesne yesterday at 7:41 PM

NATS is a fantastic piece of software, but the docs are impractical and half-baked. It’s a shame to have to reverse-engineer the software from GitHub to learn the auth schemes.

dzonga yesterday at 9:32 PM

NATS JetStream vs., say, Redis Streams: which one have people found easier to work with?

gostsamo yesterday at 7:57 PM

Thanks, those reports are always a quiet pleasure to read even if one is a bit far from the domain.

selectodude yesterday at 7:12 PM

Definitely thought this was about aviation for a moment.
