Hacker News

Bender, today at 12:54 AM

Managed over 50k servers with zero swap. Set the overcommit ratio to 0, configured min_free based on a Red Hat formula, and had application teams keep some memory free. Adjust OOM scores at application startup, especially for database servers, where panic is set to 0.

Servers ranged from 144GB to 3TB of RAM, and that memory is heavily utilized. On servers meant to be stateless app and web servers, panic was set to 2 to reboot on OOM. OOMs mostly occurred on machines used by the performance team, who were constantly load testing hardware and apps, and on a few dev machines where developers were not sharing nicely. Engineered correctly, OOM will be very rare, and this only gets better with time as applications have more controls over memory allocation, along with other tools like namespaces/cgroups. Java will always leak, just leave more room for it.
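
For anyone wondering what "adjust OOM scores at application startup" can look like in practice, here is a minimal sketch, not the setup described above. It assumes a Linux host and writes to the standard `/proc/<pid>/oom_score_adj` file, which accepts values from -1000 (never pick this process) to 1000 (pick it first). The function name `set_oom_score_adj` and the `proc_root` parameter are made up for illustration; `proc_root` exists only so the function can be pointed at a scratch directory.

```python
from pathlib import Path

# Range accepted by the kernel for /proc/<pid>/oom_score_adj.
OOM_SCORE_ADJ_MIN = -1000  # process is exempt from the OOM killer
OOM_SCORE_ADJ_MAX = 1000   # process is the preferred victim


def set_oom_score_adj(value: int, pid: str = "self", proc_root: str = "/proc") -> int:
    """Clamp value to the valid range, write it for the given pid, and return it.

    Hypothetical helper: the name and the proc_root parameter are assumptions
    made for this example, not part of any library.
    """
    clamped = max(OOM_SCORE_ADJ_MIN, min(OOM_SCORE_ADJ_MAX, value))
    path = Path(proc_root) / pid / "oom_score_adj"
    path.write_text(f"{clamped}\n")
    return clamped


# Example call at the top of a service's entry point (lowering the score
# below its current value normally requires CAP_SYS_RESOURCE or root):
#
#     set_oom_score_adj(-900)   # e.g. protect a database process
#     set_oom_score_adj(500)    # e.g. make a stateless worker an early victim
```

Raising a process's own score generally works unprivileged, so a stateless worker can volunteer itself as a victim without extra capabilities. Protecting a database usually means setting the value from whatever launches it with the needed privileges.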


Replies

anyfoo, today at 2:35 AM

There's a chance that those servers might run more efficiently with some swap space, for the reasons mentioned many times in this thread. Swap space is not just for overcommitting.
