My limited understanding here is that usage loads impact model outputs to make them less deterministic (and likely degrading in quality). See: https://thinkingmachines.ai/blog/defeating-nondeterminism-in...