It's not an either/or. They could easily let me configure whatever behavior I want: no cap, a hard cap, a soft cap, a cap that I program with a Python script, a cap where I throttle, a cap where I opt in to deleting certain machines to save money. It can all be done. People are complaining because obvious features are not provided. They would not be complaining if they had the options they need to control how resources scale in response to load, not just technical load but also financial load.
You're oversimplifying the problem in the other direction. Fine-grained scriptability of hard limits would run into all of the thorny distributed-systems problems. But I do agree that fixing the simple cases is straightforward: maximum spend rates per instant and per unit of time (e.g. per minute, hour, day, month). Providers would shoulder the small costs from the slightly leaky assumptions they have to make to implement those limits, and users could then operate within that framework to optimize what they want on a best-effort basis (e.g. a script that responds within a minute to explicitly scale resources, or a human-in-the-loop notification cycle over the course of hours, so that you have the chance to say "actually, this is popularity traffic that I really do want to pay for", etc.).
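To make the "simple case" concrete, here is a minimal sketch of per-window spend caps, assuming a single process that sees every charge. The `SpendLimiter` class, its `allow` method, and the limit values are all hypothetical, not any provider's API. A real provider would be metering across many machines, which is where the leaky assumptions come in.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Dict, Tuple


@dataclass
class SpendLimiter:
    """Hypothetical best-effort spend cap over several sliding time windows."""

    # Maps window length in seconds to the maximum spend allowed in that window.
    limits: Dict[float, float]
    # Accepted charges as (timestamp, cost), oldest first.
    events: Deque[Tuple[float, float]] = field(default_factory=deque)

    def allow(self, now: float, cost: float) -> bool:
        """Record the charge and return True only if no window would be exceeded."""
        # Forget charges older than the longest window; they no longer matter.
        longest = max(self.limits)
        while self.events and self.events[0][0] <= now - longest:
            self.events.popleft()

        # Check every configured window before accepting the charge.
        for window, cap in self.limits.items():
            spent = sum(c for t, c in self.events if t > now - window)
            if spent + cost > cap:
                return False

        self.events.append((now, cost))
        return True


# Assumed example limits: at most 1.00 per minute and 5.00 per hour.
limiter = SpendLimiter(limits={60: 1.0, 3600: 5.0})
first = limiter.allow(now=0, cost=0.6)    # fits under both caps
second = limiter.allow(now=10, cost=0.6)  # would exceed the per-minute cap
third = limiter.allow(now=61, cost=0.6)   # the first charge has aged out of the minute window
```

What happens on a `False` result is the policy question the thread is arguing about: throttle, shut things down, or just notify a human and keep serving.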
You can already do any of those things in your own code when making the API requests. The issue here is, if you unintentionally try to make a billion expensive requests or allow someone else to do it against your account, do you want them to automatically turn off your stuff or do you want the bill that comes if they don't?