logoalt Hacker News

aeonfoxtoday at 10:07 AM3 repliesview on HN

A real interesting read as someone who spends a bit of time with Elixir. Wasn't aware of the atomic and counter Erlang features that break isolation.

Though they do say that race conditions are purely mitigated by discipline at design time, but then mention race conditions found via static analysis:

> Maria Christakis and Konstantinos Sagonas built a static race detector for Erlang and integrated it into Dialyzer, Erlang’s standard static analysis tool. They ran it against OTP’s own libraries, which are heavily tested and widely deployed.

> They found previously unknown race conditions. Not in obscure corners of the codebase. Not in exotic edge cases. In the kind of code that every Erlang application depends on, code that had been running in production for years.

I imagine that the 4th issue of protocol violation could possibly be mitigated by a typesafe abstracted language like Gleam (or Elixir when types are fully implemented)


Replies

WJWtoday at 10:33 AM

> They found previously unknown race conditions. Not in obscure corners of the codebase. Not in exotic edge cases. In the kind of code that every Erlang application depends on, code that had been running in production for years.

If these race conditions are in code that has been in production for years and yet the race conditions are "previously unknown", that does suggest to me that it is in practice quite hard to trigger these race conditions. Bugs that happen regularly in prod (and maybe I'm biased, but especially bugs that happen to erlang systems in prod) tend to get fixed.

show 2 replies
dnauticstoday at 3:13 PM

not all races are bugs. here's an example that probably happens in many systems that people just don't notice: sometimes you don't care and, say, having database setup race against setup of another service that needs the database means that in 99% of cases you get a faster bootup and in 1% of cases the database setup is slow and the dependent server gets restarted by your application supervisor and connects on the second try.

kamma4434today at 11:14 AM

The 4th issue is a feature- it’s what allows zero downtime hot updates.