I used Munin a lot as well in the 2005-2010 timeframe. Still do as a backup (for when Prometheus, Grafana, and InfluxDB conspire against me) in my home lab.
Usually the 15-minute collection interval is just fine. One time, though, I had an issue with servers that were just fine and then crashed and rebooted, with no useful metrics collected between the last "I'm fine" and the first "I'm fine again".
At that point we started collecting metrics (for only those servers) every 5 seconds, and we figured out that someone had introduced a nasty bug that took a couple of weeks of uptime to run out of memory and crash everything. It was a fun couple of days.
And, for a lot of things, it's quite sufficient.
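
For anyone who wants to try the same trick, here is a rough sketch of a 5-second memory sampler. It is not what we actually ran back then; it assumes a Linux box whose /proc/meminfo exposes a MemAvailable field, and the function names and the log file name are made up for illustration.

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: integer value} dict."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def sample_available_kib(path="/proc/meminfo"):
    """Return the current MemAvailable value (in kB as reported by the kernel)."""
    with open(path) as f:
        return parse_meminfo(f.read())["MemAvailable"]


def run(interval_s=5, log_path="mem_samples.log"):
    """Append 'unix_time available_kib' to log_path every interval_s seconds."""
    with open(log_path, "a") as log:
        while True:
            log.write(f"{int(time.time())} {sample_available_kib()}\n")
            # Flush and fsync so the most recent samples are more likely
            # to be on disk if the machine goes down right afterwards.
            log.flush()
            os.fsync(log.fileno())
            time.sleep(interval_s)
```

Calling `run()` on just the suspect servers gives you a fine-grained trail leading up to the crash. A slow leak shows up as available memory creeping downward over days, which is easy to miss when samples are minutes apart and the last one lands well before the box dies.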