Two things that should be default on any GCP project touching generative-AI APIs:
1. API-key restrictions by HTTP referrer AND by API (`generativelanguage.googleapis.com` only),
2. a billing budget with a Pub/Sub "cap" action, not just an email alert.
Neither is on by default, and almost nobody sets them before shipping. 13 hours is actually fast for detection; most teams find out at end-of-month reconciliation.
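For the "cap" part, here is a rough Python sketch of the decision logic only. It assumes the budget notification arrives as a base64-encoded JSON payload with `costAmount` and `budgetAmount` fields. `should_cap`, `handle_budget_event` and the `disable_billing` callback are hypothetical names; whatever actually detaches billing or disables the API is left to the caller.

```python
import base64
import json


def should_cap(pubsub_data_b64: str) -> bool:
    """Return True once the reported cost has reached the budget amount."""
    # Assumption: the payload is base64-encoded JSON with these two fields.
    payload = json.loads(base64.b64decode(pubsub_data_b64).decode("utf-8"))
    return payload["costAmount"] >= payload["budgetAmount"]


def handle_budget_event(event: dict, disable_billing) -> bool:
    """Hypothetical handler: run the caller-supplied cap action when over budget.

    `event["data"]` is assumed to hold the base64 payload;
    `disable_billing` is any zero-argument callable that performs the cap.
    """
    if should_cap(event["data"]):
        disable_billing()
        return True
    return False
```

Keeping the decision separate from the action makes it easy to dry-run: pass a callback that only logs until you trust the thresholds.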
I want API keys with monthly and hourly quotas and RATE LIMITING.
Like 50k requests per hour, and above that 1 req/sec per client, up to 20 req/sec.
I don't want to shotgun my service for every user if one user is misbehaving. I want to set the rate of bleeding.
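Something like this, sketched in Python under my own reading of those numbers: one limiter per API key, requests pass freely until the hourly quota is used up, and after that each client gets at most 1 req/sec with a 20 req/sec ceiling across all clients. `TieredLimiter` and its parameters are made-up names, and it is in-memory and single-process only, so treat it as an illustration rather than something to deploy.

```python
import time
from collections import deque


class TieredLimiter:
    """Free-running up to an hourly quota, then throttled per client."""

    def __init__(self, hourly_quota=50_000, per_client_rps=1.0,
                 overflow_rps=20, clock=time.monotonic):
        self.hourly_quota = hourly_quota
        self.min_gap = 1.0 / per_client_rps   # seconds between requests per client
        self.overflow_rps = overflow_rps      # ceiling across all throttled clients
        self.clock = clock                    # injectable for testing
        self.window_start = clock()
        self.window_count = 0
        self.client_last = {}                 # client_id -> last throttled request time
        self.recent = deque()                 # throttled request times in the last second

    def allow(self, client_id: str) -> bool:
        now = self.clock()
        # Start a fresh hourly window when the old one has expired.
        if now - self.window_start >= 3600:
            self.window_start = now
            self.window_count = 0
        # Within quota: no throttling at all.
        if self.window_count < self.hourly_quota:
            self.window_count += 1
            return True
        # Over quota: enforce the shared ceiling first.
        while self.recent and now - self.recent[0] >= 1.0:
            self.recent.popleft()
        if len(self.recent) >= self.overflow_rps:
            return False
        # Then enforce the per-client spacing.
        last = self.client_last.get(client_id)
        if last is not None and now - last < self.min_gap:
            return False
        self.client_last[client_id] = now
        self.recent.append(now)
        return True
```

The point of the per-client check is that one noisy client burns only its own 1 req/sec slot once the quota is gone, while everyone else keeps getting served up to the shared ceiling.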