Copenjin • yesterday at 9:36 PM • 1 reply

> If something needs to be fixed, why is it just a log?

What he meant is that it is an unexpected condition, one that should never have happened but did, so it needs to be fixed.

> How is someone supposed to even notice a random error log?

Logs should be monitored.

> At the places that I've worked, trying to make alerting be triggered on only logs was always quite brittle, it's just not best practice.

Because the logs sucked. It's not common practice, but it should be best practice.

> Throw an exception / exit the program if it's something that actually needs fixing!

I understand the sentiment, but some programs cannot/should not exit. Or you have an error in a subsystem that should not bring down everything.
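To illustrate the subsystem case, here is a minimal Python sketch, not anything from the article. The `run_isolated` helper and the `app` logger name are hypothetical. It assumes each subsystem's work can be wrapped in a callable, and that an error log with a traceback is something your monitoring will actually pick up.

```python
import logging

logger = logging.getLogger("app")  # hypothetical logger name


def run_isolated(name, task):
    """Run one subsystem task without letting its failure propagate.

    Returns (True, result) on success and (False, None) on failure.
    """
    try:
        return True, task()
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback,
        # so the "should never happen" condition is recorded for fixing.
        logger.exception("unexpected failure in subsystem %s", name)
        return False, None
```

The rest of the program keeps running, and the error log is the signal that something needs to be fixed.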

I completely agree with the author's approach, but I also understand that good logging discipline is rare. I've worked in many places where the logs sucked: they just dumped stuff, and I had to restructure them.

Replies

lanstin • today at 1:37 AM

While it is fun to have your code run for 500 days without a restart, it is bad architecture. You should be able to move load around from host to host or network to network without losing any work. This involves gracefully draining the old instance and then shutting it down.
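A rough sketch of what draining can look like, in Python. `DrainingWorker` is a hypothetical name, and it assumes work arrives through an in-process queue; a real service would more likely trigger the drain from a SIGTERM handler or a load balancer health check.

```python
import queue
import threading

_STOP = object()  # sentinel marking the end of accepted work


class DrainingWorker:
    """Processes items on a background thread and supports graceful draining."""

    def __init__(self, handler):
        self._handler = handler
        self._queue = queue.Queue()
        self._lock = threading.Lock()
        self._draining = False
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def submit(self, item):
        """Accept new work unless a drain has started. Returns True if accepted."""
        with self._lock:
            if self._draining:
                return False
            self._queue.put(item)
            return True

    def _run(self):
        while True:
            item = self._queue.get()
            if item is _STOP:
                return
            self._handler(item)

    def drain_and_stop(self):
        """Refuse new work, finish everything already accepted, then stop."""
        with self._lock:
            self._draining = True
            # The queue is FIFO, so every accepted item sits ahead of the sentinel.
            self._queue.put(_STOP)
        self._thread.join()
```

Once `drain_and_stop()` returns, everything accepted earlier has been handled, so the old instance can be shut down without losing work.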

For impossible errors, exiting and sending the dev team as much info as possible (thread dump, memory dump, etc.) is helpful.
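As a hedged example of the thread dump part, Python's standard `faulthandler` module can write the stack of every thread before the process exits. The function name `die_on_impossible` and the exit code 70 are assumptions for illustration, and this sketch does not cover memory dumps or actually shipping the output to the dev team.

```python
import faulthandler
import logging
import sys

logger = logging.getLogger("app")  # hypothetical logger name


def die_on_impossible(reason, dump_file=None):
    """Log the impossible condition, dump all thread stacks, then exit."""
    logger.critical("impossible state reached: %s", reason)
    # dump_file must be a real file with a file descriptor; default to stderr.
    target = dump_file if dump_file is not None else sys.stderr
    faulthandler.dump_traceback(file=target, all_threads=True)
    # Raises SystemExit; when called from the main thread this ends the process.
    sys.exit(70)
```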

In my experience, logs are good for finding out what is wrong once you know something is wrong. Also, if the server is written to have enough but not too much logging, you can read the logs over and get a feel for normal operation.
