Hacker News

AI's Economics Don't Make Sense

136 points by spking · today at 4:39 PM · 98 comments

Comments

JohnMakin · today at 6:19 PM

I've sort of lost some of the respect for Ed that I had early on in the hype cycle. He's still right about some things, but I can see him slowly and subtly retreating from his strong position, held even a few months ago, that these things will never ever be useful for anything and it's all a scam because they don't actually do anything at all except burn money. He would say it like 8 times a monologue. I remember one podcast maybe ~6 months ago where he brought a developer skeptic on and was trying to get him to say it wasn't actually useful for coding, and the dev was like "maybe not as advertised, but I definitely use it and it is useful to me," and he pivoted off the topic very quickly.

It seems he realizes he was wrong about that and has pivoted slowly to "well, maybe they work sometimes, but the cost isn't justified." Which is a reasonable question! I just find his style of never admitting when he is wrong off-putting, along with the way he presents things as absolute fact when he's guessing like the rest of us. He was right about a lot and wrong about a lot; it's okay to admit that, and I don't think his fan base would care.

show 5 replies
joshjob42 · today at 5:22 PM

There are a few major problems with the article. The most obvious is that frontier labs are not charging remotely close to the cost of tokens; afaik most estimates put margins north of 80%. As a reference point, providers are profitably serving Kimi K2.6 at $4/1M output tokens. Is it as good as Opus? No, but it's probably at least Sonnet-level, so that's ~4x cheaper than Sonnet while still being profitable to serve on the margin. So you aren't plausibly getting into actual subsidization territory until you're over 5:1 in subscription usage to nameplate token costs.

How many tokens can you realistically burn through in one chat session? Opus and many other frontier models do maybe 60 tok/s, i.e. less than ~250k tok/hr out. Input can be much larger, but in most cases cached input is 5-10x cheaper than fresh input. Say you average 500k tokens in, 90% cached, per request. That amounts to 100-150k tokens in new-input-equivalent costs, which in most cases is ~20-30k tokens in output-equivalent costs. Do a request every minute and that's a total of about 1.5-2M tok/hr. At API prices that's $50/hr for Opus, but it probably only costs Anthropic $10/hr to serve.
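That back-of-envelope can be put in one place. A rough sketch only: the per-token prices, cache discount, and usage pattern below are the comment's assumed figures, not published rates.

```python
# Napkin model of hourly API cost for a heavy interactive coding session.
# All prices and usage figures are illustrative assumptions.

FRESH_IN_PRICE = 5e-6   # $/token for fresh input (assumed Opus-class rate)
OUT_PRICE = 25e-6       # $/token for output (assumed; ~5x the input rate)
CACHE_DISCOUNT = 5      # cached input assumed ~5-10x cheaper than fresh

def hourly_cost(requests_per_hr=60, in_tok=500_000, cache_frac=0.90, out_tok=4_000):
    fresh = in_tok * (1 - cache_frac)                    # uncached input tokens
    cached_equiv = in_tok * cache_frac / CACHE_DISCOUNT  # cached tokens, discounted
    in_equiv = fresh + cached_equiv                      # "new-input-equivalent" tokens
    out_equiv = in_equiv * FRESH_IN_PRICE / OUT_PRICE + out_tok
    dollars = in_equiv * FRESH_IN_PRICE + out_tok * OUT_PRICE
    return requests_per_hr * dollars, requests_per_hr * out_equiv

dollars, out_equiv = hourly_cost()
print(f"~{out_equiv/1e6:.1f}M output-equivalent tok/hr -> ${dollars:.0f}/hr at list price")
```

With these assumptions it lands in the same ballpark as the comment: roughly 1.5-2M output-equivalent tokens and ~$50 per hour at list prices.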

That said, even if a developer is burning $50/hr, many, many employees at large companies cost more than $100k/yr to employ, all costs considered, so making them say 20-30% more productive can easily make that worth it. If the labs ultimately shave their margins to more like 20-30%, you'd have ~$15/hr in costs to use the services, and nearly every white-collar job is way over $30k/yr to employ. If your salary is $80k, you probably cost the company $200k all-in, so making you 15% more productive offsets the $15/hr cost.
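As a sanity check on that break-even claim, a minimal sketch (the $15/hr spend, $200k fully-loaded cost, and 2,000 working hours per year are illustrative assumptions):

```python
# Break-even check: what productivity gain covers a given hourly AI spend?
# Figures are illustrative, not real payroll data.

def breakeven_gain(ai_cost_per_hr, fully_loaded_cost_per_yr, hours_per_yr=2_000):
    """Fractional productivity gain needed to offset the AI spend."""
    ai_cost_per_yr = ai_cost_per_hr * hours_per_yr
    return ai_cost_per_yr / fully_loaded_cost_per_yr

# $15/hr of AI usage against a $200k fully-loaded employee:
print(f"{breakeven_gain(15, 200_000):.0%}")  # prints 15%
```

So a 15% productivity lift on a $200k fully-loaded employee exactly offsets $15/hr of AI spend over a 2,000-hour year.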

So first-party providers are not in a horrifying position from a subsidization standpoint. The people in bad shape are Cursor and Perplexity, who don't have frontier models and are dependent on the open-source community, which is typically 6-12 months behind the frontier. They have to pay full-freight API costs, at 80% margin for the big boys, to serve their harnesses, which is indeed untenable. They'll either have to force users onto open-source and/or in-house models they can serve at cost, or they'll have to charge vastly more.

Gemini, Claude, and ChatGPT first-party services like Antigravity, Codex, and Claude Code are not in serious trouble though.

show 6 replies
gwbas1c · today at 7:04 PM

What's the quote?

> Don't attribute to malice what can be attributed to incompetence.

We're currently used to SaaS billing models that are either all-you-can-eat subscriptions or metered around some easy-to-understand metric like number of users or gigabytes consumed.

SaaS economics work that way because the compute consumed is typically too cheap to meter. Some customers use a little more than average, some use a little less; it's not worth the effort to even it out to the penny.

AI is so darn CPU- (GPU? AIPU?) intensive that it will only be profitable, and affordable, if it can be metered like electricity and billed with a small margin.

In SaaS, we're not used to metering and billing compute this way.

milesvp · today at 5:37 PM

Reading this piece, I'm reminded of a podcast I heard some years ago interviewing an early Google marketing employee about the economics of Google Search. They said they'd done some surveys and concluded that the average user would get something like $20/year of value from search, so that was the most they could realistically charge for it. Meanwhile, they could make something like $500/user in Q4 alone from advertising. So, of course, advertising.

I just don't think that LLM business models can survive the allure of advertising dollars, any more than Search could, or TV, or Radio, or Movies. Ignoring the talk of copilot putting ads into pull requests, there is just no way that publicly hosted LLMs will not end up inserting ads into the output.

This looks like what I remember. https://freakonomics.com/podcast/is-google-getting-worse/

show 1 reply
mitjam · today at 7:14 PM

I would be curious to see a calculation backwards from TAM. Napkin: 50M developers worldwide (SlashData; 20M in China and India). If every developer had a $200/month subscription, that's $10B/month. And I think many developers are expected to pay much more than that.
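The napkin math, spelled out (the developer count and subscription price are the comment's assumptions, not market data):

```python
# Napkin TAM: 50M developers worldwide (SlashData estimate cited in the comment),
# each on a $200/month subscription.
devs = 50_000_000
sub_per_month = 200  # dollars

monthly = devs * sub_per_month
print(f"${monthly/1e9:.0f}B / month, ${monthly*12/1e9:.0f}B / year")
# prints $10B / month, $120B / year
```

That annualized $120B is the ceiling under these assumptions; real penetration and price points would be far lower for most of those 50M developers.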

BosunoB · today at 6:44 PM

All subscription models are subsidized by users who don't use much. Somebody on a $20 sub getting $50 in value isn't crazy if there are three people who only get $10 in value. This isn't some sign that the model is broken; it's the intended outcome.

Also, I didn't read this whole thing, but I have yet to see Zitron respond to the strongest AI financials claim, which is that the models themselves are profitable on a life-cycle basis, even if the companies are not profitable on an annual basis due to capital expenditure. Dario has made exactly this claim, and it more or less blows up all of Zitron's financial arguments.

show 3 replies
iooi · today at 5:29 PM

The entire basis of this article is that generating tokens is a variable cost and that that cost will not decrease over time.

> On an economic basis, a monthly subscription only makes sense with relatively static costs.

Running a data center is a fixed expense. Whether or not people use that data center to its capacity doesn't change how much the operator pays (electricity factors into this, since a GPU running at 100% draws more watts than an idle one, but it doesn't move the needle much against the other fixed and variable costs of a data center).

> They also assumed, I imagine, that the cost of tokens would come down over time, versus what actually happened — while prices for some models might have come down, newer “reasoning” models burn way more tokens, which means the cost of inference has, somehow, gotten higher over time.

This is backwards. When the cost of something goes down, people use it more. This is basic supply and demand. Inference has gotten cheaper already, and will continue to do so.

Companies subsidizing costs for growth happens all the time. Yes, switching to usage-based pricing instead of subscriptions sucks for customers, but enterprises will continue to pay.

show 1 reply
pmdr · today at 6:36 PM

I wonder how long until this post is flagged/"shadowbanned". Such was the fate of almost all of Ed's posts on HN, with little surprise as to why.

show 1 reply
lbrito · today at 5:15 PM

>At some point, the incredible, toxic burn-rate of generative AI is going to catch up with them, which in turn will lead to price increases, or companies releasing new products and features with wildly onerous rates (..) that will make even stalwart enterprise customers with budget to burn unable to justify the expense.

I pray this happens soon, but I feel I've been hearing some version of it for a while.

show 2 replies
chankstein38 · today at 7:02 PM

Before subscribing to Claude, I put $15 into my account so I could use it from Cline in VS Code. In less than a few hours I was out of money. That was basically just to get a simple project set up and a few ~1,000-line (AI-generated) code files edited. I have heard Cline is less than ideal with token management, but regardless, these services can easily cost hundreds or thousands of dollars a month when billed on usage ($15 per ~4 hours × 2 = $30 per work day; $30 × 25 = $750/month). And that assumes my very light usage here would even apply to a larger code base. My guess is that if I hooked it up to an enterprise project it'd easily skyrocket to $60+/day.
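The extrapolation, written out (assuming the observed ~$15 per ~4 hours holds, an 8-hour work day, and 25 working days a month, as in the comment):

```python
# Extrapolating an observed pay-as-you-go burn rate to a working month.
burn_per_hr = 15 / 4        # observed: ~$15 consumed over ~4 hours
per_day = burn_per_hr * 8   # 8-hour work day (two 4-hour blocks)
per_month = per_day * 25    # 25 working days

print(f"${per_day:.0f}/day, ${per_month:.0f}/month")  # prints $30/day, $750/month
```

Linear extrapolation is generous here: an enterprise codebase means larger contexts per request, so the real number would likely be higher, as the commenter guesses.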

Glyptodon · today at 5:57 PM

I think there's another route this goes. At $7k a year or more per eng in token use, I think it's very reasonable to buy engineers machines with obscene GPUs and RAM and run models locally. And if it doesn't make sense now, someone will figure it out and save companies $10k+/eng over 3 years.
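A rough break-even sketch for that claim (the $10k workstation price and 3-year horizon are hypothetical figures; the $7k/yr token spend is the comment's):

```python
# Break-even: a beefy local workstation vs. ongoing cloud token spend.
# The workstation cost is a hypothetical assumption, not a quoted price.
token_spend_per_yr = 7_000   # comment's estimate of per-engineer token use
workstation_cost = 10_000    # assumed one-time hardware outlay
years = 3

savings = token_spend_per_yr * years - workstation_cost
print(f"${savings:,} saved per engineer over {years} years")
# prints $11,000 saved per engineer over 3 years
```

This ignores electricity, depreciation, and the quality gap between local and frontier models, all of which would shrink the savings; it's only meant to show why the idea is tempting at these spend levels.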

show 1 reply
wood_spirit · today at 5:10 PM

The general problem the average user has with a metered instead of provisioned billing model for computing services is that you can't easily control for cost overruns. From the old days of customers getting stung for hosting costs when slashdotted or DDoSed, to last decade's microservice shock-horror of the CI retry loop that burns money overnight, to today's AI where you basically have no idea how efficient the model will be while it ponders your question: you are setting yourself up for disappointment, cost overruns, and a feeling that you're not getting the value for money you got last week.

show 2 replies
threepts · today at 5:53 PM

I thought this burning of cash was all justified by the exponential growth we saw in the last 6 years.

They went from GPT-2, a text-only model with goldfish-esque memory and an 8th-grade reading level, to what we have today with GPT-5: multimodality, a token window encompassing an encyclopedia, and Masters/Doctorate-level mastery of major subjects.

The economics are probably betting on this exponential growth continuing; if it fails, the cash will have been burned.

bananamogul · today at 6:14 PM

The good news is that this might be the end of Oracle.

ameliaquining · today at 5:50 PM

As it happens, published just this morning is an article from Kelsey Piper that explains in some detail what's wrong with Zitron's takes: https://www.theargumentmag.com/p/ais-biggest-critic-has-lost...

show 2 replies
wonderwhyer · today at 5:04 PM

Yeah. And weird pricing seems like it's winding down.

It's interesting to compare it to electricity. Basically Anthropic was selling a flat fee electricity subscription, and when someone started connecting expensive washing machines (OpenClaw) to their subscriptions, instead of changing the pricing model, they banned washing machines...

I wonder if we will get to electricity-style pricing for AI. What makes electricity predictable is relatively constant average usage over time, plus manageable prices. I just don't buy electric house heating, and I manage my electricity spending within some bounds.

With AI the problem is that we are only now getting to useful AI, and for now it's still too expensive to be useful, so they subsidize until they can stabilize at a "cheap enough and smart enough" level. But it feels like that's still 2 years away, while they are stopping the subsidies now. It will be interesting.

show 3 replies
mNovak · today at 5:42 PM

Do we know the breakdown of revenue from API vs subscriptions for OAI/Anthropic? That seems very relevant, since the entire article seems to rest on the premise that users are only willing to pay for a subsidized subscription and would never pay the 'true' token cost.

The internet seems to be saying that 70%+ of Anthropic revenue is per-token metered API, which would largely invalidate the article, but I can't find a solid source.

show 1 reply
matchagaucho · today at 5:51 PM

Same debate as the dot-com era.

Customer: “I don’t want to pay more than $100/mo for my website”
Developer: “What are your goals?”
Customer: “1M daily visits, 1,000 monthly signups.”

We've spent the past 25 years offering serverless compute, auto-scaling, and pay-as-you-go AWS and Internet infrastructure, and the economics are still a hard sell.

ludicrousdispla · today at 7:04 PM

Does this mean we can just go back to using software libraries?

cheeseblubber · today at 5:25 PM

It makes sense if you account for the cost of intelligence getting cheaper every year. Per unit of intelligence, most models are getting far cheaper: better hardware, architectures, training techniques, inference optimizations, and caching. All those improvements add up. In early 2022 costs were dropping ~10x annually; now it's closer to 2x-5x annually. The cost is still dropping, whereas Uber could only get its costs down by so much.
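To see why this matters for a fixed-price subscription, a tiny illustration of compounding cost declines (the 2x-5x annual figures are the comment's estimates, not measured data):

```python
# If cost per unit of intelligence drops by a constant factor each year,
# a fixed monthly price buys compounding amounts of intelligence.

def multiplier(annual_factor, years=2):
    """Intelligence-per-dollar multiplier after `years` of annual cost declines."""
    return annual_factor ** years

for factor in (2, 5):
    print(f"{factor}x/year -> {multiplier(factor)}x more per dollar after 2 years")
# prints 2x/year -> 4x more per dollar after 2 years
#        5x/year -> 25x more per dollar after 2 years
```

This is the sense in which a $20 subscription today is not the same product as a $20 subscription two years ago, as other commenters here note.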

Ritewut · today at 5:38 PM

It makes sense when you realize the goal is not the consumer but large gov and enterprise contracts.

putzdown · today at 5:34 PM

The move from "the subscription model for AI isn't working given these parameters" to "a subscription model for AI can never work" to "the model was deliberately deceptive" to "it's a fucking ripoff" is not logical. AI companies are feeling the need to get hold of spiraling costs by increasing prices and limitations. Inference hasn't gotten cheap enough fast enough, and for some reason they feel they can't wait longer. That doesn't mean a subscription service can't work: only that it will be expensive, maybe vastly so, and will need usage-based tiers with some fluidity for users to move between them in a given month. The model is something like HP's "Instant Ink" service. Sure, there's a question of whether the moves companies are making now are worth the cost in the eyes of customers. But that's a question of economics and timing, not a fundamental blow to monthly subscriptions as a model. The article doesn't deal with these considerations fairly; it's too much of a rant, with conspiracy theories thrown in.

OrvalWintermute · today at 6:43 PM

I think the company Taalas alone destroys Ed's arguments. Comparing vs. GPUs:

- ~16k–17k tokens/second per user
- <1ms latency
- 10x power efficiency
- 20x cheaper production
- Model to Si in ~60 to 90 days

We have every reason to believe SW_to_Si will facilitate improving economics

christkv · today at 5:36 PM

I'm just flabbergasted at the massively inefficient usage of tokens. What are people doing to spend $500/day on tokens? I just don't understand what you could possibly be doing that wouldn't be complete spaghetti at the end if you run something in an auto-loop.

show 3 replies
feverzsj · today at 5:50 PM

It makes perfect sense, if you treat it as a Ponzi scheme.

[0]: https://www.wheresyoured.at/why-are-we-still-doing-this/

throwawayajner · today at 5:40 PM

Zitron misunderstands the economics of models. Inference costs have dropped 99% in less than 2 years. Models are being commoditized faster than any technology in history.

A $20 subscription 2 years ago is not providing the same level of intelligence you're getting today.

Every major lab knows open source models are 6 months behind (See Google's "We have no moat") and none of them plan to make money on inference. Companies are subsidizing users to create moats that persist when models are essentially free for most everyday use.

show 1 reply
Marciplan · today at 5:29 PM

I am a paying subscriber to Ed Zitron and I enjoy his writing a lot. But he should at some point admit that not everything is bullshit and that there is definitely a business model here. It is fun to read, though.

show 2 replies
asah · today at 5:08 PM

meh - by this logic, every new tech and startup ever is a "scam"

The truth is that the AI companies are gambling that inference cost will continue following a hyper version of Moore's Law, e.g. Google TurboQuant.

The countervailing thesis is that frontier models are consuming more and more compute.

The deepest truth: you often don't need a frontier model to get commercially acceptable results from AI. So bring on the true pricing! I'll just switch models to something financially sustainable.

show 1 reply
jcgrillo · today at 5:08 PM

The finding out phase has begun.