I’m only an armchair expert on Erlang. But having looked into it repeatedly for a couple of decades, my takeaway is that the “Let it crash” slogan is good, just presented a bit out of context. Or at least it assumes context that most people don’t have.
Erlang is used in situations involving a zillion incoming requests. If an individual request fails, maybe it was important and maybe it wasn’t. If it was, the client is expected to try again. What matters most is that the rest of the requests are not interrupted.
What makes Erlang different is that it is natural and trivial to shut down an individual request in the event of an error without worrying about putting any other part of the system into a bad state.
You can pull this off in other languages via careful attention to the details of your request-handling code. But, the creators of the Erlang language and foundational frameworks have set their users up for success via careful attention to the design of the system as a whole.
That’s great in the contexts in which Erlang is used. But in the context of a desktop app like OpenOffice, it’s more like saying “Let it throw”, with “it” being some user action. The slogan then amounts to having a language and framework with exception handling so robust and so built-in that error handling becomes trivial and nearly invisible.
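To make the "careful attention to your request-handling code" part concrete, here is a minimal Python sketch of that idea, not anything Erlang itself does. It assumes a hypothetical `handle_request` that touches only its own arguments, and a hypothetical `serve` loop that acts as the single boundary where failures are caught and reported.

```python
import logging

logger = logging.getLogger("requests")


def handle_request(payload):
    """Hypothetical handler: written for the happy path, no shared state."""
    # A bad payload simply raises; there is no defensive code here.
    return f"hello, {payload['name'].strip()}"


def serve(payloads):
    """Run every request; a failure in one never interrupts the others."""
    results = []
    for payload in payloads:
        try:
            results.append(handle_request(payload))
        except Exception:
            # The one place where failures are reported.
            logger.exception("request failed: %r", payload)
            results.append(None)
    return results
```

This only works because the handler keeps no state outside its arguments, which is exactly the discipline Erlang gives you by default and other languages leave up to you.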
> You can pull this off in other languages via careful attention to the details of your request-handling code. But, the creators of the Erlang language and foundational frameworks have set their users up for success via careful attention to the design of the system as a whole.
+10. So many people miss this very important point. If you have lots of mutable shared state, or can accidentally leak it into your actor code, then the whole actor/supervision tree thing falls over very easily, because you can't just restart an actor without worrying about the rest of the system.
I think this is a large (but not the only[0]) part of why actors/supervisors haven't really caught on anywhere outside of Erlang, even for problem spaces where they would be suitable.
[0] I personally feel the model is very hard to reason about compared to threaded/blocking straight-line code using e.g. structured concurrency, but that may just be a me thing.
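A small Python sketch of the shared-state problem, under made-up assumptions: the account names and both `transfer_*` functions are hypothetical, and a plain function call stands in for an actor. The first version crashes halfway through mutating shared state, so "just restart it" would start from a corrupted world. The second works on a private copy, so a crash leaves nothing behind.

```python
def transfer_shared(balances, src, dst, amount):
    """Mutates shared state in two steps; a crash in between leaks a half-done update."""
    balances[src] -= amount
    if dst not in balances:
        raise KeyError(dst)  # crash after the first write has already landed
    balances[dst] += amount


def transfer_isolated(balances, src, dst, amount):
    """Works on a private copy and hands back new state only on success."""
    updated = dict(balances)
    updated[src] -= amount
    updated[dst] += amount  # raises KeyError for an unknown account
    return updated
```

After a failed `transfer_shared`, the caller's dict has lost money from `src`; after a failed `transfer_isolated`, the caller's dict is untouched and a retry is safe.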
Let it crash, so that if something goes wrong, it does not do so silently.
Let it crash, because a supervisor will detect it, report it, clean it up, and restart it, without you having to write a line of code for that.
Let it crash as soon as possible, so that any problem (like a crash loop) is readily visible. It's very easy to replace arbitrary bits of Erlang code in a running system without affecting the rest of it. "Fix it in prod" is better than "miss it in prod", especially when you can never stop prod.
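For readers who have never seen the detect/report/restart loop, here is a rough in-process Python sketch of the shape of it. It is not how OTP supervisors are implemented (those are separate processes with their own restart strategies); `supervise`, `CrashLoop`, and `max_restarts` are names invented for this illustration.

```python
import logging

logger = logging.getLogger("supervisor")


class CrashLoop(RuntimeError):
    """Hypothetical error: the worker kept crashing, so stop hiding it."""


def supervise(worker, max_restarts=3):
    """Run worker(); on a crash, report it and start it again from scratch."""
    crashes = 0
    while True:
        try:
            return worker()
        except Exception as exc:
            crashes += 1
            # Report: the failure is never silent.
            logger.error("worker crashed (crash %d): %r", crashes, exc)
            if crashes > max_restarts:
                # A crash loop is surfaced loudly instead of retried forever.
                raise CrashLoop(f"gave up after {crashes} crashes") from exc
```

The worker itself contains no error handling at all; everything about detection, reporting, and giving up lives in one place outside it.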