Good point. I'm often running multiple parallel jobs with varying priorities where uniform throttling actually makes sense. Many LLM inference tasks are long-running but not fully utilizing hardware (often waiting on I/O or running at partial capacity)
The dual Epyc CPUs (128 cores) in my setup have a relatively high idle power draw compared to consumer chips. Even when "idle" they're consuming significant power maintaining all those cores and I/O capabilities. By implementing uniform throttling when utilization is low, the automation actually reduces the baseline power consumption by a decent amount without much performance hit.
Good point. I'm often running multiple parallel jobs with varying priorities where uniform throttling actually makes sense. Many LLM inference tasks are long-running but not fully utilizing hardware (often waiting on I/O or running at partial capacity)
The dual Epyc CPUs (128 cores) in my setup have a relatively high idle power draw compared to consumer chips. Even when "idle" they're consuming significant power maintaining all those cores and I/O capabilities. By implementing uniform throttling when utilization is low, the automation actually reduces the baseline power consumption by a decent amount without much performance hit.