Why? Unlike loads tied to a real physical process, AI training has absolutely no need to run constantly.
If you invest in chips that depreciate really fast, not running them at full capacity because of power constraints would be counterproductive.
You are correct in the sense that they can stop work in a way many generic server use cases can't (which shows up in the lower power supply reliability requirements the article mentions), but running expensive servers at 50% utilization would dramatically cut the revenue generated per dollar of capital invested, i.e. you couldn't afford to buy the servers.
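
To put rough numbers on the utilization point, here is a minimal sketch. Every figure in it is a made-up assumption for illustration (a $30,000 accelerator, a 4-year useful life, $2.00 of revenue per hour at full load), and it assumes straight-line depreciation while ignoring power and other operating costs. The function name `annual_return_on_capital` is hypothetical too.

```python
HOURS_PER_YEAR = 8760


def annual_return_on_capital(capex, useful_life_years, revenue_per_hour, utilization):
    """Yearly revenue minus straight-line depreciation, per dollar of capex.

    All inputs are illustrative assumptions; power and other opex are ignored.
    """
    revenue = revenue_per_hour * HOURS_PER_YEAR * utilization
    depreciation = capex / useful_life_years
    return (revenue - depreciation) / capex


# Hypothetical numbers, not taken from the article or from any vendor.
full = annual_return_on_capital(30_000, 4, 2.00, utilization=1.0)
half = annual_return_on_capital(30_000, 4, 2.00, utilization=0.5)

print(f"100% utilization: {full:.1%} return per year")
print(f" 50% utilization: {half:.1%} return per year")
```

Under these assumptions the return drops from roughly 33% to roughly 4% per year. Halving utilization halves revenue, but the depreciation bill stays the same, so the margin nearly disappears.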