The training cost for a model is constant. The more that model gets used, the lower the training cost per inference query gets, since that one-time cost is shared across every inference prompt.
It is true that there are always more training runs going, and I don't think we'll ever find out how much energy was spent on experimental or failed training runs.
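To put rough shape on the amortization argument, here is a minimal sketch. The function name and all the numbers are made up for illustration (arbitrary energy units, not measurements of any real model), and it deliberately ignores the experimental and failed runs mentioned above, since nobody outside the labs knows those figures.

```python
def training_cost_per_query(training_cost: float, queries_served: int) -> float:
    """Share of a one-time training cost carried by each inference query."""
    if queries_served <= 0:
        raise ValueError("queries_served must be positive")
    return training_cost / queries_served


# Hypothetical one-time training cost, in arbitrary energy units.
TRAINING_COST = 1_000_000.0

# The per-query share shrinks as the same fixed cost is spread over more queries.
for queries in (1_000, 1_000_000, 1_000_000_000):
    share = training_cost_per_query(TRAINING_COST, queries)
    print(f"{queries:>13,} queries: {share} units per query")
```

The per-query share only ever goes down as usage goes up; what the sketch cannot tell you is how large the numerator really is once you count every run that never shipped.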
> The training cost for a model is constant
Constant until the next release? The battle for the benchmark-winning model is driving the release cadence up, and this competition probably raises the cost of training and evaluation too.