the visible cost of burning runway on a bill is very often far less than the invisible cost of burning engineer time rebuilding undifferentiated heavy lifting rather than working on product/customer needs
Most of the complexity in observability is client-side.
It is not hard to spin up Grafana and VictoriaMetrics (and now VictoriaLogs) and keep them running. It is not hard to build a Grafana dashboard that correlates data across both metric and log sources, and the alerting functionality is pretty good now.
The "heavy lift" is instrumenting your applications and infrastructure to provide valuable metrics and logs without exceeding a performance budget. I'm skeptical that Datadog actually does much of that heavy lifting, or that it is actually worth the money. You can probably spend something like 10x less, with the same or better outcomes, by paying for managed Grafana + managed DBs and a couple of FTEs as observability experts.
This is very well stated.
People say this, but I wonder about it from time to time. I don't think anyone is asking you to rebuild Datadog from scratch for your company, but surely it's worth migrating to something less expensive even if it takes a bit of elbow grease.
I wouldn't really say "very often". Occasionally, perhaps.
Even from a pure zero-sum mathematical perspective, it can make sense to invest as much as 2 or 3 months of engineer time in cloud cost-saving measures. If the engineer is making $200K, that's roughly a $33,000 to $50,000 investment. When you see the eye-watering cloud bills many startups have, you realize that investment is peanuts compared to the potential savings over the next several years.
And then you also have to keep in mind that these things are usually not actually zero-sum. The engineer could be new, and working on the efficiency project helps them onboard to your stack. It could be that customers are complaining (or could start complaining in the future) about how slow your product is, so you actually improve the product by improving the infrastructure. Or it could just be the very common case that there isn't a higher-value thing for that engineer to be working on at that time.
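For anyone who wants to plug in their own numbers, here is a rough back-of-the-envelope sketch of that math. The $200K salary and the 2 to 3 month range come from the comment above; the monthly savings figure is purely a made-up placeholder, and the cost is salary only (no benefits or overhead), so treat it as a lower bound on the investment.

```python
def engineer_cost(annual_salary: float, months: float) -> float:
    """Salary-only cost of an engineer for a number of months."""
    return annual_salary * months / 12


def payback_months(investment: float, monthly_savings: float) -> float:
    """Months of savings needed to recover the up-front investment."""
    if monthly_savings <= 0:
        raise ValueError("monthly_savings must be positive")
    return investment / monthly_savings


ANNUAL_SALARY = 200_000               # from the comment above
HYPOTHETICAL_MONTHLY_SAVINGS = 10_000 # assumption, replace with your own estimate

for months in (2, 3):
    cost = engineer_cost(ANNUAL_SALARY, months)
    payback = payback_months(cost, HYPOTHETICAL_MONTHLY_SAVINGS)
    print(f"{months} months of work: ${cost:,.0f}, "
          f"paid back in {payback:.1f} months of savings")
```

The interesting knob is the savings estimate: the smaller your bill, the longer the payback period, which is where the "occasionally, not very often" caveat comes from.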