There are a few major problems with the article. The most obvious is that frontier labs are not charging remotely close to the cost of tokens; afaik most estimates put gross margins north of 80%. As a reference point, providers are profitably serving Kimi K2.6 at $4/1Mtok out. Is that as good as Opus? No, but it's probably at least Sonnet level, so that's ~4x cheaper than Sonnet while still being profitable to serve on the margin. And at an ~80% margin, the serving cost is only ~20% of list price, so you aren't plausibly getting into actual subsidization territory until a subscriber burns more than 5x their subscription's nameplate token value.
How many tokens can you realistically burn through in one chat session? Opus and many other frontier models generate maybe 60 tok/s, so under 250k tok/hr out. Input can be much larger, but in most cases cached input is 5-10x cheaper than new input. Say you average 500k tokens in per request, 90% of it cached. That amounts to 100-150k tokens of new-input-equivalent cost, which in most cases is ~20-30k tokens of output-equivalent cost. Do a request every minute and that's a total of about 1.5-2M tok/hr. At API prices that's $50/hr for Opus, but it probably only costs Anthropic $10/hr to serve.
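A minimal sketch of that arithmetic, with the per-request figures from above and an assumed Opus-class output price plugged in (none of these are published numbers):

```python
# Back-of-napkin hourly cost for the session described above.
# Every figure here is an assumption from the comment, not a quoted price.

REQUESTS_PER_HOUR = 60                 # one request per minute
NEW_INPUT_EQUIV_PER_REQ = 150_000      # 500k in, 90% cached -> ~100-150k new-input-equivalent
INPUT_TO_OUTPUT_PRICE_RATIO = 5        # input is typically ~5x cheaper than output
OUTPUT_TOKENS_PER_HOUR = 200_000       # ~60 tok/s generating most of the hour
PRICE_PER_M_OUTPUT = 25.0              # assumed Opus-class $/1M output tokens
ASSUMED_MARGIN = 0.80                  # assumed ~80% gross margin

input_as_output_equiv = (NEW_INPUT_EQUIV_PER_REQ / INPUT_TO_OUTPUT_PRICE_RATIO
                         * REQUESTS_PER_HOUR)                        # ~1.8M tok/hr
total_output_equiv = input_as_output_equiv + OUTPUT_TOKENS_PER_HOUR  # ~2M tok/hr

price_per_hour = total_output_equiv / 1e6 * PRICE_PER_M_OUTPUT       # ~$50/hr
cost_to_serve = price_per_hour * (1 - ASSUMED_MARGIN)                # ~$10/hr

print(f"~{total_output_equiv / 1e6:.1f}M output-equivalent tok/hr")
print(f"API price ~${price_per_hour:.0f}/hr, serving cost ~${cost_to_serve:.0f}/hr")
```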
That said, even if a developer is burning $50/hr, many, many employees at large companies cost more than $100k/yr to employ all costs considered, so making them say 20-30% more productive can easily be worth it. If the labs ultimately shave their margins to more like 20-30%, you'd be at ~$15/hr to use the services, which is ~$30k/yr at full-time usage, and nearly every white-collar job costs way more than that to staff. If your salary is $80k, you probably cost the company $200k all-in, so making you 15% more productive offsets the $15/hr cost.
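The break-even version of that, under the same assumed numbers (the loaded cost, hours, and hourly price are the hypotheticals above):

```python
# Break-even check on the $15/hr figure, using the assumed numbers above.

TOOL_COST_PER_HOUR = 15.0
HOURS_PER_YEAR = 2_000                 # roughly full-time
FULLY_LOADED_COST = 200_000            # assumed all-in cost of an $80k-salary employee

tool_cost_per_year = TOOL_COST_PER_HOUR * HOURS_PER_YEAR    # $30k/yr
breakeven_gain = tool_cost_per_year / FULLY_LOADED_COST     # 0.15 -> 15%

print(f"tool cost: ${tool_cost_per_year:,.0f}/yr")
print(f"break-even productivity gain: {breakeven_gain:.0%}")
```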
So first-party providers are not in a horrifying position from a subsidization standpoint. The people in bad shape are the likes of Cursor and Perplexity, who don't have frontier models and depend on the open-source community, which is typically 6-12 months behind the frontier. They have to pay full-freight API prices, at that ~80% margin, for the big labs to serve their harnesses, which is indeed untenable; they'll either have to push users onto open-source and/or in-house models they can serve at cost, or charge vastly more.
Gemini, Claude, and ChatGPT first-party services like Antigravity, Codex, and Claude Code are not in serious trouble though.
It's not even a fixed cost per token (even though it's billed that way, which is still miles better than fixed-price all-you-can-eat). You're incurring a cost proportional to the number of generated tokens times the context each one attends over (plus the prefill cost for any uncached input), so the expense grows quadratically with your average generated context.
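A toy model of that scaling, assuming each generated token attends over the full context before it (prefill, KV-cache mechanics, and all constants ignored; only the shape matters):

```python
# Toy scaling model: each newly generated token attends over everything
# before it, so decode-attention work for N generated tokens on a prompt
# of length P is roughly
#   sum_{i=1..N} (P + i) = N*P + N*(N+1)/2   (arbitrary units).

def decode_attention_work(prompt_len: int, gen_len: int) -> int:
    return sum(prompt_len + i for i in range(1, gen_len + 1))

for gen_len in (1_000, 10_000, 100_000):
    work = decode_attention_work(prompt_len=10_000, gen_len=gen_len)
    print(f"gen {gen_len:>7,}: work {work:,}")
# Each 10x increase in generated context costs far more than 10x once
# the quadratic term dominates.
```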
This all becomes extremely visible when you try agentic coding with local language models: you quickly realize that controlling context length and model size matters just as much as avoiding wasted effort. The real scam is not AI Q&A à la ChatGPT; that's actually quite viable, though marginally less so as conversations grow longer. It's agentic coding with SOTA models and huge contexts.
> How many tokens can you realistically burn through in one chat session?
I've used single digit billions in a couple days, FWIW.
> afaik most estimate north of 80% profit margins
This seems to be the lynchpin of your argument.
It makes me wonder if I have been living under a rock, because I have never heard of frontier labs making money. AFAIK all AI firms are simply burning money to acquire customers at this stage. Is this wrong?
lots of words.
do you think per token prices will go up or down in the long term? will the price per task trend down or up?
what about the price of human labor?
> That said, even if a developer is burning $50/hr, many, many employees at large companies cost more than $100k/yr to employ all costs considered, so making them say 20-30% more productive can easily be worth it. If the labs ultimately shave their margins to more like 20-30%, you'd be at ~$15/hr to use the services, which is ~$30k/yr at full-time usage, and nearly every white-collar job costs way more than that to staff. If your salary is $80k, you probably cost the company $200k all-in, so making you 15% more productive offsets the $15/hr cost.
Nobody, including the linked article, is arguing that this can never be profitable. People are saying "there is no way this admittedly quite interesting tool is going to make back all of this money," and I think they are completely right.
You can absolutely make money with this stuff, just not at this scale. The buildout for this shit has been certifiably crazy, and a number of the firms involved are overleveraged to the tune of tens or even hundreds of billions of dollars.
How in the sweet fuck are you paying that off, plus giving investors dividends, selling this at $15/hour/user??? That math does not math. A quick google says there are between 1.5 and 4.4 million developers in the US alone; let's say it's 5 million, to be generous, and each of them pays for 8 hours per day, every working day. That's $600 million per day, or roughly $150 billion per year in revenue. Even if you took ALL of that revenue and put it toward paying down the debt, it would take most of a decade to cover what OpenAI has already committed to, and once you actually pay for salaries, serving costs, upkeep, and ongoing development, you're into DECADES.
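Running that napkin in code (the developer count and price are from above; the commitment figure is an assumed order of magnitude, not an audited number):

```python
# US-developers-only revenue vs. the scale of the buildout commitments.

DEVELOPERS = 5_000_000        # generous upper bound from above
PRICE_PER_HOUR = 15.0
HOURS_PER_DAY = 8
WORKDAYS_PER_YEAR = 250
COMMITMENTS = 1.0e12          # assumed ~$1T order of magnitude for the buildout

revenue_per_day = DEVELOPERS * PRICE_PER_HOUR * HOURS_PER_DAY    # $600M/day
revenue_per_year = revenue_per_day * WORKDAYS_PER_YEAR           # ~$150B/yr
years_at_full_revenue = COMMITMENTS / revenue_per_year           # ~6.7 years

print(f"${revenue_per_day / 1e6:.0f}M/day, ${revenue_per_year / 1e9:.0f}B/yr")
print(f"years to cover ${COMMITMENTS / 1e12:.1f}T at 100% of revenue: "
      f"{years_at_full_revenue:.1f}")
# And that assumes zero spent on salaries, serving, or further R&D.
```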
And yes, I'm sticking strictly to code, because that's the only thing I've seen it be really good at. Are we really proposing that every knowledge worker on earth, and every manager of such workers, is going to have an autonomous agent running all the time!? To do what, make sure they don't have to read or write email? Even just that example drags in a fucking mess of legal, compliance, and security problems, because LLMs are not intelligent and are not capable of being properly secured.
Like, I'm sorry, I cannot take this industry seriously when even the most basic back-of-napkin math is saying, nay, screaming from the rooftops, that they are FUCKED.
Isn't this akin to saying Big Pharma could easily make money if it just stopped doing expensive research? The massive R&D spend is the core of the business plan; it's the only reason they can demand high prices in the first place. Once OpenAI stops spending billions on training, its pricing power vanishes, because users will just migrate to Anthropic or whoever releases the next frontier model. That would imply there's room for only one lab to outlast the rest in some sort of war of attrition (perhaps similar to the silicon industry).