> There seems to be a strong "instrument everything" culture
Metrics are the easiest way to expose your application’s internal state, and as a maintainer of that service, that puts you in nirvana. Even if you don’t go that far, you’re likely an engineer writing code, and when it comes time to add some metrics, why wouldn’t you add more rather than fewer? And once you have all of them, why not add all possible labels? Meanwhile your Prometheus server is in a crash loop because it ran out of RAM, but that problem isn’t visible to you. Unfortunately there’s a big gap in understanding between writing instrumentation code in an editor and its effect on resource usage at the other end of your observability pipeline.
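To make the "all possible labels" point concrete, here is a rough sketch of how label combinations multiply. It assumes the worst case where every combination of label values actually occurs; the label names and value counts are hypothetical, not taken from any real service.

```python
from math import prod


def series_count(label_cardinalities: dict[str, int]) -> int:
    """Worst-case number of time series for one metric.

    Each distinct combination of label values is its own series,
    so the upper bound is the product of the per-label value counts.
    """
    return prod(label_cardinalities.values())


# Hypothetical labels and value counts, for illustration only.
base = {"method": 5, "status": 10, "endpoint": 50}
with_user = {**base, "user_id": 10_000}

print(series_count(base))       # 2500
print(series_count(with_user))  # 25000000
```

One extra high-cardinality label turns a few thousand series into tens of millions, and none of that cost shows up in the code that added it.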
I can only say that I once tried to add massive amounts of data points to a fleet of battery systems: 750 cells per system, 8 metrics per cell, one cell every 20 ms. It came to megabits per second, so we only enabled it when engaging the batteries. But the data was worth it, because it let us do data modelling on live events in retrospect, after we had initially been too busy fixing things. Observability is a superpower.
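A back-of-the-envelope version of that arithmetic, as a sketch only. It assumes every cell reports all of its metrics once per 20 ms and that each sample takes 4 bytes on the wire with no protocol overhead; both are guesses for illustration, not figures from the comment above.

```python
def samples_per_second(cells: int, metrics_per_cell: int, interval_ms: float) -> float:
    """Samples produced per second if every cell reports once per interval."""
    return cells * metrics_per_cell * 1000 / interval_ms


def megabits_per_second(samples_per_s: float, bytes_per_sample: int) -> float:
    """Raw payload bandwidth in Mbit/s, ignoring framing and protocol overhead."""
    return samples_per_s * bytes_per_sample * 8 / 1_000_000


# 750 cells, 8 metrics per cell, 20 ms interval (from the comment above).
rate = samples_per_second(cells=750, metrics_per_cell=8, interval_ms=20)

# ASSUMPTION: 4 bytes per sample; the real encoding is not stated.
mbps = megabits_per_second(rate, bytes_per_sample=4)

print(rate)  # 300000.0 samples per second per system
print(mbps)  # roughly 9.6 Mbit/s per system
```

Under those assumptions a single system lands in the megabits-per-second range, which fits with only enabling the stream while the batteries were engaged.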