
drodman, today at 6:07 PM

In general, you do need to be aware of any agent-level rate limits as well as any ingestion limits from the provider. We do some pretty careful sampling and aggregation for most of the metrics, logs, and traces we store and, as mmcclure said, in this case it was the rules on the node agents themselves that threw the errors. The volume of logging on some of the critical paths of the service got high enough that logs were dropped due to our configured rate limits.
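To illustrate the failure mode described here, below is a minimal sketch of a token-bucket limiter of the kind a node agent might apply to log lines. It is not taken from any specific agent or from the setup discussed in this thread; the names `TokenBucket` and `simulate`, and all of the rates used, are hypothetical.

```python
class TokenBucket:
    """Hypothetical agent-side limiter: `rate` lines/second, bursts up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = rate            # tokens refilled per second
        self.burst = burst          # maximum tokens the bucket can hold
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0             # timestamp of the previous call
        self.dropped = 0            # count of lines rejected so far

    def allow(self, now):
        # Refill for the time elapsed since the last call, capped at `burst`.
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        # No token available: the line is dropped rather than queued.
        self.dropped += 1
        return False


def simulate(lines_per_second, seconds, rate, burst):
    """Emit evenly spaced log lines and return (kept, dropped)."""
    bucket = TokenBucket(rate, burst)
    kept = 0
    for i in range(lines_per_second * seconds):
        now = i / lines_per_second  # simulated clock, in seconds
        if bucket.allow(now):
            kept += 1
    return kept, bucket.dropped
```

With a low-volume path, for example `simulate(10, 1, rate=100, burst=100)`, every line gets through. Push a hot path to 1000 lines per second against the same kind of limit and most lines are dropped, which is the point at which the agent starts reporting errors.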

