Hacker News

Pro Max 5x quota exhausted in 1.5 hours despite moderate usage

444 points by cmaster11 today at 1:15 PM | 404 comments

Comments

semiquaver today at 3:52 PM

As an anecdote, I use the pro max 5x plan heavily for coding and have almost never hit a limit.

elthor89 today at 3:12 PM

Are local models dedicated to programming any good yet? That could be a way to deal with Anthropic or others flip-flopping on token usage and limits.

pawelduda today at 2:59 PM

50 days ago I wrote this [1], as the world seemed high on AI and it gave me crypto-bubble vibes.

Since then, I've been seeing increasing criticism of Anthropic in particular (several front-page posts on HN, especially in the past few days), either for models being nerfed or for just straight up eating usage quota (which matches my personal experience). It appears we're once again getting hit by an enshittification of sorts.

Nowadays I rely a lot on LLMs daily for architecture and writing code, but I'm so glad that the majority of my experience came from the pre-AI era.

If you use these tools, make sure you don't let them atrophy your software engineering "muscles". I'm positive that in the long run LLMs are here to stay. The jump in what you can self-host or run on consumer hardware is huge, year after year. But if your abilities rely on one vendor, what happens when you come to work one day and find you're locked out of your Swiss Army knife and can no longer outsource your thinking?

[1] https://news.ycombinator.com/item?id=47066701

gavinray today at 2:12 PM

Codex is the only CLI I've had purely positive experiences with. Take that for what you will.

algoth1 today at 2:04 PM

Wasn't Anthropic previously offering double the token allowance outside busy hours? Now they're counting tokens at the normal rate again. But yeah, it's not good. I use Codex because Claude insists on peeking at and messing with folders and files outside its work area, though.

dr_dshiv today at 2:32 PM

"Hey Claude, can you help me create a strategy to optimize my token use so I don't run into limits so often?" --> worked for me! I had two $200 plans before and now I am cool despite all day use

armchairhacker today at 2:25 PM

Make an AI usage tracker like https://marginlab.ai/trackers/codex/. These hearsay anecdotes prove nothing.

bad_haircut72 today at 2:18 PM

They also need to fix the 30 second lag between submitting the request and actually starting to get tokens back - it used to be instant, and still is at work where we use Anthropic models via an enterprise copilot subscription.

10keane today at 1:55 PM

this same pattern seems to occur every time a new model is about to release. i didn't notice the usage problem - i am on 20x. but opus 4.6 feels significantly dumber for some reason. i can't quantify it, but it failed on everyday tasks it used to complete perfectly

Achshar today at 1:53 PM

I feel like I'm living in a bubble: no one seems to mention Antigravity in these discussions, and I haven't had any issues with the Ultra subscription yet. It seems to go on forever, and the interface is so much better for dev work compared to CC. (Though admittedly my experience with CC is limited.)

I strongly believe Google's resources will allow it to sustain this influx of compute demand without the rug-pull that OAI or Anthropic will be forced into as more people come onboard for the code-gen use case.

docheinestages today at 2:53 PM

Anthropic paved the path for agentic coding, and their pricing made it possible for masses of people to discover and experiment with this new style of development. Their Claude Code plans subsidized model usage so heavily that I'm sure they had negative margins for quite some time. But now that they've acquired a substantial user base, it makes sense for them to dial back and get greedier. The quiet and weird changes to Claude's behavior in recent weeks must be due to both this increased greed and their struggles with scaling.

What I wish for right now is for open-weight models and hardware companies (looking at you Apple) to make it possible to run local models with Opus 4.6-level intelligence.

@Anthropic I've cancelled my subscription. Good luck :)

niklasd today at 2:20 PM

We also experienced hitting our Claude limits much earlier than before during the last two weeks - to the point where we thought it must be a bug.

delduca today at 2:48 PM

I noticed the same in recent weeks. I canceled my Max 5x and subscribed to Copilot (with Opus 4.6).

Now it's hard to hit the limit...

sdevonoes today at 1:59 PM

I guess it’s better to step away now, while we still can, rather than wait until it becomes impossible (Stockholm syndrome).

No FOMO

lforster today at 1:46 PM

Lol imagine how much overcharging is going on for enterprise tokens. This is just the beginning.

gessha today at 1:50 PM

I’m processing some images (custom board game images -> JSON) with a common layout and basic structure, and I exhausted my quota after just 30 images (pleb Pro account). I have 700 images to process…

What I did instead was tune the prompt for Gemma 4 26B and a 3090. Worked like a charm. Sometimes you have to run the main prompt and then a refinement prompt, or split the processing into cases, but it’s doable.

Now I’m waiting for anyone to put up some competition against NVIDIA so I can finally afford a workstation GPU for less than the price of a new kidney.
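A minimal sketch of the local image-to-JSON pipeline described here, assuming an OpenAI-compatible local server (e.g. llama.cpp or vLLM); the endpoint URL, model name, and extraction prompt are all assumptions, not details from the comment:

```python
import base64
import json
import urllib.request

# Assumed local server and model name -- adjust to whatever you actually serve.
ENDPOINT = "http://localhost:8000/v1/chat/completions"
MODEL = "gemma-vision-local"

def build_request(image_bytes: bytes, prompt: str) -> dict:
    """Build an OpenAI-style chat payload with an inline base64-encoded image."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": MODEL,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{b64}"}},
            ],
        }],
        # Many local servers honor this and constrain output to valid JSON.
        "response_format": {"type": "json_object"},
    }

def extract(image_bytes: bytes) -> dict:
    """Send one image to the local model and parse the JSON it returns."""
    payload = build_request(
        image_bytes,
        "Extract the card name, cost, and abilities from this board game image as JSON.",
    )
    req = urllib.request.Request(
        ENDPOINT,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return json.loads(body["choices"][0]["message"]["content"])
```

A refinement pass, as the comment suggests, would just be a second `build_request` call that feeds the first pass's JSON back in with a "fix/verify this" prompt.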

bit1993 today at 1:57 PM

You know Emacs still works.

jandrese today at 1:57 PM

I mean, this is expected, is it not? These companies burned unimaginable amounts of investor cash to get set up, and now they have to start turning a profit. They can't make up the difference with volume because the costs are high, so the only option is to raise prices.

softwaredoug today at 2:38 PM

So glad I just pay by the token.

TheRealPomax today at 2:40 PM

And in classic Anthropic fashion at this point, their issue tracker appears to be just for show. No one triages issues, no one responds to them.

peterpanhead today at 2:01 PM

I don't understand Anthropic. Be consistent. Why do models deteriorate like this? It's not good for workflows or for trust. What, Opus 4.7 is going to come out and the same thing will happen again? Come on.

qwertyforce today at 1:51 PM

that's exactly why i prefer codex

dboreham today at 3:37 PM

Random data point: I beat on Claude pretty much every day and have never run into limits of any kind.

stavros today at 1:48 PM

It's crazy, a few weeks ago the limits would comfortably last me all week. This week, I've used up half the limit in a day.

tiahura today at 1:48 PM

Also on Pro Max 5x, and I hit the quota for the first time yesterday.

jedisct1 today at 1:31 PM

GPT-5.4 works amazingly well.

I’ve moved away from Claude and toward open-source models plus a ChatGPT subscription.

That setup has worked really well for me: the subscription is generous, the API is flexible, and it fits nicely into my workflow. GPT-5.4 + Swival (https://swival.dev) are now my daily drivers.

spiderfarmer today at 1:29 PM

That’s why I switched to Codex. It’s so much more generous and in my experience, just as good. Also, optimizing your setup for working with agents can easily make a 5x difference.

x86hacker1010 today at 4:09 PM

I'm sorry, but I finally have to cancel; it's gotten abysmal.

behole today at 3:02 PM

I shred my Max 5x in 2 hours on the reg this week! GLM, here I come!

iLoveOncall today at 2:48 PM

It's very easy to calculate the actual cost, given they list the exact tokens used. Taking the AWS Bedrock pricing for Opus 4.6 with 1M context (because Anthropic's own API prices are subsidized and sold at a loss), here's what each category costs:

Cache reads cost $0.31

Cache writes cost $105

Input tokens cost $0.04

Output tokens cost $28.75

The total spent in the session is $134.10, while the Pro Max 5x subscription is $100.

Even at Anthropic's own API pricing, we arrive at $80.58. Below the subscription price, but not by much.

It's just the end of the free tokens; nothing to see here. It's easy to feel like you're doing "moderate" or even "light" usage because you use so few input tokens, but those "agentic workflows" are simply not financially viable.
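For what it's worth, the per-category figures above do sum to the quoted total. A quick check (the dollar amounts are taken directly from the comment, not recomputed from token counts):

```python
# Per-category session costs quoted above (AWS Bedrock pricing, Opus 4.6, 1M context)
costs = {
    "cache reads": 0.31,
    "cache writes": 105.00,
    "input tokens": 0.04,
    "output tokens": 28.75,
}

# Sum and round to cents to sidestep float noise.
total = round(sum(costs.values()), 2)
print(f"session total: ${total:.2f}")  # prints "session total: $134.10"
```

Note that cache writes dominate the bill, which is exactly why agentic workflows feel expensive even when the visible input/output token counts look small.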

desireco42 today at 2:34 PM

I don't use Claude, so this doesn't affect me directly, but I worry it will spoil the fun for me, for the following reason.

They inflated how much their tools burn tokens from day one, pretty much - remember all the pointless research and reports Claude always wanted to do, no matter what you asked it? Other tools are much smarter about this, so for them it's not such a big deal.

More importantly, these moves tend to reverberate through the industry, so I expect others will clamp down on usage a lot, and that will spoil my joy of using AI without counting every token.

Burning tokens doesn't just waste your allotment; it also wastes your time. This gave rise to "turbo" offerings where you get responses faster but burn 2x your tokens.

nprateem today at 2:34 PM

I've seen ridiculously fast quota usage on Antigravity too: sometimes lots of work is possible, then it all goes, literally within 4 questions.

Probably a combination of it being vibe coded shit and something in the backend I expect.

lvl155 today at 1:42 PM

Constant complaints about Anthropic, not much about OAI/Codex. It seems people should just use OAI and come back when they realize compute isn't free elsewhere either.

rdevilla today at 1:34 PM

Bubble's bursting, get in.

mannanj today at 1:34 PM

So basically the Anthropic employee who responded says those 1h cache writes were almost never read again, so a silent change to 5m caching is in our best interest and saves cost (justifying why they did it silently).

However, his response gaslights us, because the math in the OP's opening post demonstrates this is not true: it shows 26x more reads, so at least in his case the cache is not doing what the Anthropic employee describes.

Clearly we are being charged for less optimization here, and the message (from my perspective, from Anthropic) is that if you're in a special situation, your needs don't matter and they'll close your thread without really listening.

holoduke today at 1:45 PM

I spent the full 20x weekly quota in less than 10 hours. How is that possible? Well, try mass-translating texts into 30 languages and you'll hit the limits extremely quickly.

bakugo today at 2:10 PM

This is your regular friendly reminder that these subscriptions do not entitle you to any specific amount of usage. That "5x" is utterly meaningless because you don't know what it's 5x of.

This is by design, of course. Anyone who has been paying even the slightest bit of attention knows these subscriptions are not sustainable, and the prices will have to go up over time. Quietly reducing the usage limits that they were never specific about in the first place is much easier than raising the prices of the individual subscription tiers, with the same effect.

If you want to know what kind of prices you'll be paying to fuel your vibe-coding addiction in a few years, try out API pricing for a bit, and try not to cry when your $100 of credit is gone in 2 days.


rvz today at 1:51 PM

Why are so many 'developers' complaining about Claude rate-limiting them? You know you can actually... use local LLMs instead of donating your money to Anthropic's casino?

I guess this is fitting when the person who submitted the issue is in "AI | Crypto".

Well, there's no crying at the casino when you exhaust your usage or token limit.

The house (Anthropic) always wins.

vfalbor today at 1:45 PM

Some months ago I created a piece of software for this exact reason. It hasn't had any success, but the idea is that communities could reduce token consumption: not everything needs an LLM, and you can share the results of API calls between agents. Even though my idea wasn't a success, I still think sharing things with each other is a good concept. If you're interested, it's called tokenstree.com.