Hacker News

IgorPartola, today at 1:13 AM

Say you are debugging a memory leak in your own code that only shows up in production. How do you propose to do that without direct access to a production container that is exhibiting the problem, especially if you want to start doing things like strace?


Replies

joshuamorton, today at 2:08 AM

I will say that, with very few exceptions, this is how a lot of $BigCo engineers operate every day. When I run into an issue like this, I will do a few things:

- Roll back, and/or investigate the changelog between the current and prior version to see which code paths are relevant

- Use our observability infra that is equivalent to `perf`, but samples ~everything, all the time, again to see which code paths are relevant

- Potentially try to push additional logging or instrumentation

- Try to better repro in a non-prod/test env where I can do more aggressive forms of investigation (debugger, sanitizer, etc.) but where I'm not running on production data

I certainly can't strace or run raw CLI commands on a host in production.
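As a rough illustration of the "push additional instrumentation" step, here is a minimal sketch assuming the leaking service is written in Python, which the thread does not say. It uses the standard library's `tracemalloc` to diff heap snapshots taken before and after a workload and report which source lines grew. The names `top_allocation_growth`, `leaky_handler`, and `_leaked` are hypothetical, invented for this example; in a real service the report would go to logs or metrics rather than stdout.

```python
import tracemalloc

def top_allocation_growth(workload, limit=5):
    """Run workload() and return (location, bytes_grown) for the lines that grew most."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        workload()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # compare_to() returns StatisticDiff entries sorted by absolute size change.
    stats = after.compare_to(before, "lineno")
    growth = [s for s in stats if s.size_diff > 0]
    return [
        (f"{s.traceback[0].filename}:{s.traceback[0].lineno}", s.size_diff)
        for s in growth[:limit]
    ]

# Hypothetical leak: a module-level list that is only ever appended to.
_leaked = []

def leaky_handler():
    for _ in range(1000):
        _leaked.append(bytearray(1024))

for location, grew in top_allocation_growth(leaky_handler):
    print(f"{grew:>10} bytes  {location}")
```

Tracing every allocation has a real CPU and memory cost, so something like this would more plausibly be gated behind a flag and enabled on a small fraction of instances than left on everywhere.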
