> We had a budget alert (€80) and a cost anomaly alert, both of which triggered with a delay of a few hours. By the time we reacted, costs were already around €28,000.
I had a similar experience with GCP: I set a budget of $100 and was only emailed 5 hours after exceeding it, by which time I was well over budget.
It's mind-boggling that features like this aren't prioritized. Sure, it would probably make Google less money short term, but surely that's preferable to giving devs such a poor experience that they'd never recommend your platform to anyone again.
Considering the number of repositories on public GitHub with hard-coded Gemini API tokens in the shared source code (https://github.com/search?q=gemini+%22AIza%22&type=code), this hardly comes as a surprise. Google has also historically treated API keys as non-secrets; with the introduction of keys for LLM inference, users are suddenly supposed to treat them as secrets, but I'm not sure everyone got that memo yet.
Considering that the author didn't share which website this is about, I'd wager they either leaked the key themselves via their frontend, or they shared their source code with the credentials still in it.
As others have said, this is a "feature" for Google, not a bug. There is no easy way to set a hard cap on billing for a project. I spent the better part of an hour trying to find one in the billing settings in GCP, only to land on Reddit and learn that you can set a budget alert to trigger a Pub/Sub message, which triggers a Cloud Function to disable billing for the project. Insanity.
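For anyone stuck doing the same workaround: here's a minimal sketch of that Pub/Sub-triggered kill switch, assuming the standard budget-notification fields (`costAmount`, `budgetAmount`) and the Cloud Billing v1 client. The project ID is a placeholder; verify against Google's own docs before relying on it.

```python
import base64
import json

def should_disable(notification: dict) -> bool:
    """Decide whether the budget notification warrants cutting billing."""
    cost = notification.get("costAmount", 0)
    budget = notification.get("budgetAmount", 0)
    return cost > budget

def handle_budget_alert(event, context):
    """Pub/Sub-triggered Cloud Function entry point."""
    notification = json.loads(base64.b64decode(event["data"]).decode("utf-8"))
    if not should_disable(notification):
        return
    # Lazy import so the decision logic above stays testable without the client lib.
    from googleapiclient import discovery
    billing = discovery.build("cloudbilling", "v1")
    name = "projects/my-project-id"  # placeholder project ID
    # Detaching the billing account stops all paid usage on the project.
    billing.projects().updateBillingInfo(
        name=name, body={"billingAccountName": ""}
    ).execute()
```

Note that this is still subject to the aggregation lag others describe here: the function only fires when the (delayed) budget notification arrives.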
These billing systems are all poorly designed from a CX perspective.
Billing is usually event driven. Each spending instance (e.g. API call) generates an event.
Events go to queues/logs, aggregation is delayed.
You get alerts when aggregation happens; if the aggregation service has a hiccup, that can be many hours later (the service SLA and the billing aggregator SLA are different).
Even if you have hard limits, the limits trigger on the last known good aggregate, so a spike can make you overshoot the limit.
All of these protect the company, but not the customer.
If they really cared about customer experience, once a hard limit hits, that limit sets how much the customer pays until it is reset, period, regardless of any lags in billing event processing.
That pushes the incentive to build a good billing system. Any delays in aggregation potentially cost the provider money, so they will make it good (it's in their own best interest).
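A toy simulation of that overshoot (hypothetical numbers, not any provider's real pipeline): enforcing a hard limit against a lagged aggregate still lets a spike blow straight through it.

```python
def simulate(spend_per_tick, limit, lag):
    """Enforce a hard limit against an aggregate that lags by `lag` ticks.
    Returns total spend at the moment the lagged limit finally trips."""
    total = 0
    history = []  # spend recorded per tick
    for tick, spend in enumerate(spend_per_tick):
        # The enforcer only sees events aggregated up to `lag` ticks ago.
        visible = sum(history[: max(0, tick - lag)])
        if visible >= limit:
            break
        total += spend
        history.append(spend)
    return total

# Steady spend of 1/tick, then a compromised-key spike of 100/tick:
spend = [1] * 10 + [100] * 10
simulate(spend, limit=20, lag=0)  # -> 110: one spike tick already overshoots
simulate(spend, limit=20, lag=5)  # -> 610: the lag multiplies the damage
```

Even the zero-lag case overshoots by one spike-sized tick, which is exactly why "the limit sets how much the customer pays, period" is the only customer-protecting semantics.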
on a more positive note, you saved a few bucks not running your own server or database.
I read the following [0] and immediately went to my firebase project to downgrade my plan. This is horrific.
> Yes, I’m looking at a bill of $6,909 for calls to GenerativeLanguage.GenerateContent over about a month, none of which I made. I had quickly created an API key during a live Google training session. I never shared it with anyone and it’s not pushed to any public (or private) repo or website.
0 - https://discuss.ai.google.dev/t/unexpected-gemini-api-billin...
It is scary building on the public cloud as a solo dev or small team. No real safety net, possibly unbounded costs, etc. A large portion of each personal project I do is spent thinking about how to prevent unexpected costs, detect and limit them, and react to them. I used to just chuck everything onto a droplet or VPS, but a lot of the projects I am doing lately need services from Google or AWS. I tend to prefer GCP at this point because at least I can programmatically disconnect the billing account when they get around to tripping the alert.
Forgive my ignorance - but what's the payoff for fraudsters in getting access to a generative AI service for a short-ish period of time, before they get cut off?
With EC2 / GCE credentials, I could understand going all out on bitcoin mining - but what are they asking the AI to do here that's worth setting up some kind of botnet or automation to sift the internet for compromised keys?
Slightly off-topic, but Backblaze B2 has usage caps that actually work. I have $0 cap on API requests, and yesterday when litestream burned through the free tier (defaults to replicating every second), I got a notice and requests stopped working until I upped my cap.
We had this exact same problem (the key initially wasn’t a secret but became a secret once we enabled Gemini API with no warnings).
We managed to catch it somewhat early through alerting, so the damage was only $26k.
We asked our Google cloud support rep for a refund - they initially came back with a no but now the case is under further consideration.
I’d escalate this up the chain as much as possible.
Two things that should be default on any GCP project touching generative-AI APIs:
1. API-key restrictions by HTTP referrer AND by API (`generativelanguage.googleapis.com` only);
2. a billing budget with a Pub/Sub "cap" action, not just an email alert.
Neither is on by default, and almost nobody sets them before shipping. 13 hours is actually fast for detection; most teams find out at end-of-month reconciliation.
Don’t use GCP (and other big clouds) until they sort out their safeguards.
All three of the big cloud subreddits have stories like this on a regular basis
> Are there recommended safeguards beyond ... moving calls server-side?
This implies the API calls originated in the client, suggesting the client may have had the API key.
Calculating cost in real time is logistically extremely hard. I don't think a single big cloud provider offers hard limits instead of alerts.
As long as they revert the charge when notified of scenarios like this (and they have historically done so in many cases), it's fine. It's an acceptable workaround for a hard problem and a cost of doing business, just like credit card companies accept a certain amount of fraud loss as part of business.
The top comment on the post physically hurt me. We've moved past the era of keeping env files in code bases and are now actually serving them, lol.
Unfortunately, this is just one more story like this. One of these unexpected-usage charges in the thousands appears every month, with the same automatic denial, too. This is one of the reasons I stopped using these kinds of pay-per-usage cloud services long ago. At best, I still use services with hard-bounded usage limits, like EC2 from AWS, where one instance can never go beyond 24h/day of usage and is always capped, with shutdowns when exceeded, plus limited credit cards.
It's super frustrating that this is the only option to realistically deal with this issue, since all stories end up the same way: The cloud company just saying "f* you, we don't care, pay up." and legal fees are always expensive :(
Also, can't you tie a key to a domain or IP address to help stop unauthorized usage?
Can you pre-load money into your account and have that be used until it's zero, at which time you have to load more? Deepseek does it this way.
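The prepaid model amounts to a simple ledger (this is purely illustrative; I have no idea how Deepseek actually implements it):

```python
class PrepaidAccount:
    """Prepaid billing: requests draw down a balance and are refused at zero."""

    def __init__(self, balance=0.0):
        self.balance = balance

    def top_up(self, amount):
        self.balance += amount

    def charge(self, cost):
        """Deduct and return True if funds cover the call, else refuse it."""
        if cost > self.balance:
            return False  # hard stop: no spend beyond what was pre-loaded
        self.balance -= cost
        return True

acct = PrepaidAccount()
acct.top_up(10.0)
acct.charge(4.0)  # succeeds, balance now 6.0
acct.charge(7.0)  # refused: would exceed the pre-loaded balance
```

The hard part for a cloud provider is checking that balance synchronously on every metered event instead of reconciling hours later, which is exactly the aggregation-lag problem described elsewhere in this thread.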
Take them to court.
As always, you will need to make lots of noise on here and similar channels visited by influential people so stuff can get actioned.
Leading tech companies in 2026, folks.
I thought the pricing model was meant to be a benefit of the cloud? All of a sudden, shock horror, paying by the minute turns out to be no cheaper and maybe even more expensive than just doing it yourself
It's fucking bonkers that nothing in the system flagged this as unusual and worth throttling. The embarrassment of it: a company LITERALLY SELLING machine learning services and expertise cannot spot such a thing... That alone should have led them to handle this internally and refund it. Just... wow, Google.
It's terrible that giant cloud providers such as Google or AWS don't allow a hard cap at the project level, or prepaid billing. Especially because alerts are delayed, as the author stated: "We had a budget alert (€80) and a cost anomaly alert, both of which triggered with a delay of a few hours. By the time we reacted, costs were already around €28,000."
I said this when this finding was originally posted and I'll say it again: This is by far the worst security incident Google has ever had, and that's why they aren't publicly or loudly responding to it. It's deeply embarrassing. They can't fix it without breaking customer workflows. They really, really want it to just go away and six months from now they'll complete their warning period to their enterprise contracts and then they can turn off this automated grant. Until then they want as few people to know about it as possible, and that means if you aren't on anyone's big & important customer list internally, and you missed the single 40px blurb they put on a buried developer documentation site, you're vulnerable and this will happen to you.
Disgusting behavior.
good
i have seen this so many times...
i'm thinking it's time we replaced api keys.
some type of real time crypto payment maybe?
With AI there is NO justification for NOT DOING IT YOURSELF. Why use Firebase or <technology-x> when you can generate <the-thing> yourself and deploy to hardware you own or rent?
> We had a budget alert (€80) and a cost anomaly alert, both of which triggered with a delay of a few hours
> By the time we reacted, costs were already around €28,000
> The final amount settled at €54,000+ due to delayed cost reporting
So much for the folks defending these three companies' refusal to provide a hard spending cap ("but you can set a budget", "you're doing it wrong if you worry about billing", "a hard cap is technically impossible", etc.)