
827a today at 4:09 PM

> Zitron's numbers don't tell us the real cost of generating tokens but, subject to the assumption that the platforms are not subsidizing the token price, that means Anthropic is subsidizing their enterprise customers by up to 40 times, and OpenAI up to 70 times

Neither Anthropic nor OpenAI are subsidizing enterprise customers. Neither Anthropic nor OpenAI allow Business nor Enterprise customers access to the high value $200/mo plan. Both organizations have moved to a "cheaper plan per user + API Pricing after that" (e.g. $20/mo + usage). The $100/$200/mo plans are for individuals only (of course, many individuals use these plans at work, but that's beside the point; they aren't selling this plan to enterprises).

> SemiAnalysis also analyzed the platform's gross margins, implausibly assuming that tokens were priced at 4 times the cost of generating them and: With the current subsidies, all it takes for a user to have a gross margin of at best negative 25% is for them to use as little as 25% of their rate limit.

The article's source for this claim is not SemiAnalysis; it's Zitron. But once you dig through his article, Zitron links to a SemiAnalysis tweet [1] where they, as the paragraph states, implausibly assume gross margins of 75% (tokens priced at 4 times cost) to come up with their weird analysis of the subscription plans. Citing this for anything is weird, because afaik that 75% number is a total shot in the dark. We have no clue what their margins are. My take is that the only reason that 75% number is implausible is that it may underestimate the inference margins of Ant/OAI's API pricing.
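To show how much the conclusion swings on that one assumption, here's a rough sketch of the arithmetic. The relation between markup and gross margin (4x cost means 75%) is just algebra; everything else is a placeholder. In particular, the $4,000 "API value of a full rate limit" is a hypothetical number picked to keep the math readable, not a figure from the tweet or from Zitron.

```python
def markup_to_gross_margin(markup: float) -> float:
    """Gross margin implied by pricing tokens at `markup` times their cost."""
    return 1.0 - 1.0 / markup


def subscription_gross_margin(
    plan_price: float,
    api_value_at_full_limit: float,
    utilization: float,
    markup: float,
) -> float:
    """Gross margin on a flat-rate plan under an ASSUMED API markup.

    api_value_at_full_limit: what the plan's full rate limit would cost at
        API list prices (hypothetical input, not a published number).
    utilization: fraction of the rate limit the user actually consumes.
    """
    # Serving cost = API-priced value of the tokens used, divided by the markup.
    serving_cost = api_value_at_full_limit * utilization / markup
    return (plan_price - serving_cost) / plan_price


if __name__ == "__main__":
    print(f"4x markup -> {markup_to_gross_margin(4.0):.0%} gross margin")
    # Same plan, same usage, three different guesses at the API markup.
    for markup in (2.0, 4.0, 8.0):
        margin = subscription_gross_margin(
            plan_price=200.0,
            api_value_at_full_limit=4000.0,  # placeholder
            utilization=0.25,
            markup=markup,
        )
        print(f"markup {markup:.0f}x -> plan margin {margin:+.1%}")
```

With these placeholder inputs the same user goes from deeply unprofitable to comfortably profitable just by changing the assumed markup, which is why the 75% guess matters so much.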

[1] https://x.com/SemiAnalysis_/status/2064815045767213400?ref=w...


Replies

bayarearefugee today at 4:18 PM

> it may underestimate the inference margins of Ant/OAI's API pricing.

If true, then why is neither Anthropic nor OpenAI dropping their API pricing to gain market share, when both are clearly doing all sorts of political and PR maneuvering to compete in a cutthroat market?

Since they aren't dropping the API usage prices (and are in fact raising them in a lot of subtle ways), one of these options almost has to be true: they are still subsidizing inference; training costs are so ridiculously high that they need to make huge profits off inference or collapse in on themselves; or they are price fixing.

minraws today at 4:25 PM

Given my experience with hosting these models at scale, working and optimizing load, I don't think the margins are nearly as high as 75% if the models are as big as people often claim.

I don't know why DeepSeek is so cheap, but realistic pricing should be around their initial price, which was 4x higher. At that price you have a healthy 25-50% margin depending on occupancy, given that DeepSeek V4 is a very sparse MoE model.

GLM 5.2, for example, doesn't have more than 30-50% margins, and that's assuming old GPU pricing; with current inflated GPU pricing I am certain the margins must be lower. Of course you can host for cheaper with quantization, and if you have very consistent capacity/utilization, which is not the norm with AI workloads.
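A back-of-the-envelope sketch of why occupancy and GPU pricing move the margin so much. All numbers here (node cost, peak throughput, token price) are made up for illustration and don't describe any real deployment or model.

```python
def cost_per_million_tokens(
    node_cost_per_hour: float,
    peak_tokens_per_hour: float,
    occupancy: float,
) -> float:
    """Serving cost per 1M tokens for a node that is only partly busy.

    occupancy: fraction of peak throughput actually sold (0 < occupancy <= 1).
    The node costs the same per hour whether or not it is busy.
    """
    if not 0.0 < occupancy <= 1.0:
        raise ValueError("occupancy must be in (0, 1]")
    tokens_served = peak_tokens_per_hour * occupancy
    return node_cost_per_hour / tokens_served * 1_000_000


def gross_margin(price_per_million: float, cost_per_million: float) -> float:
    """Gross margin as a fraction of the selling price."""
    return 1.0 - cost_per_million / price_per_million


if __name__ == "__main__":
    PRICE = 4.0                 # hypothetical $ per 1M tokens
    PEAK = 12_000_000           # hypothetical tokens/hour at full load
    for label, node_cost in (("old GPU price", 24.0), ("inflated GPU price", 36.0)):
        for occupancy in (1.0, 0.75, 0.5):
            cost = cost_per_million_tokens(node_cost, PEAK, occupancy)
            print(f"{label}, occupancy {occupancy:.0%}: "
                  f"margin {gross_margin(PRICE, cost):+.0%}")
```

In this toy setup, halving occupancy doubles the cost per token, so a margin that looks healthy at full load can disappear at realistic utilization.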

Overall, for large models like GPT 5.5 or Opus, there must be healthier margins of around 50-70%, assuming GPU pricing didn't increase for these companies. Even if it did, a 30-40% margin should be possible, even in the worst case where every GPU they have saw a jump in pricing.

For smaller models it's hard to say. I would guess 20%, but these models might be much smaller than I suspect, in which case it might be double that.

Note that the issue is that less intelligent tokens don't scale down linearly in memory usage, which is the biggest pain point of serving models. Context sizes have fucked us all.

Also, anyone claiming OAI makes lower margins on its APIs might be wrong, given that they are on much smaller context sizes; 1M context is definitely a lot more expensive to serve, especially with smaller models like Sonnet.
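For a sense of scale, here's a rough sketch of per-request KV cache size using the standard formula for plain grouped-query attention (one key and one value vector per layer, per KV head, per token). The model shape below is a made-up example, not the config of Sonnet or any other real model, and it ignores tricks like cache quantization, sliding windows, or compressed attention that would shrink the number.

```python
def kv_cache_bytes(
    n_layers: int,
    n_kv_heads: int,
    head_dim: int,
    context_tokens: int,
    bytes_per_value: int = 2,  # fp16/bf16
) -> int:
    """KV cache size for ONE sequence with plain multi-head/grouped-query attention."""
    # Factor of 2: one key vector and one value vector per token, head, and layer.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_value
    return per_token * context_tokens


if __name__ == "__main__":
    GIB = 2**30
    # Hypothetical model shape, for illustration only.
    shape = dict(n_layers=64, n_kv_heads=8, head_dim=128)
    for context in (8_192, 131_072, 1_048_576):
        size = kv_cache_bytes(context_tokens=context, **shape)
        print(f"{context:>9} tokens -> {size / GIB:.0f} GiB of KV cache per request")
```

The cache grows with context length and layer/head count rather than with how "smart" the model is, so a cheaper model at 1M context can still tie up a lot of GPU memory per request.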

andrekandre today at 5:26 PM

> Neither Anthropic nor OpenAI allow Business nor Enterprise customers access to the high value $200/mo plan.

They may not "allow" it, but I've seen firsthand enterprises encourage employees to use these accounts personally and get reimbursed later, as a cost-control measure to avoid pay-as-you-go pricing with limits for users who do tokenmaxing...