
iooi today at 5:29 PM

The entire basis of this article is that generating tokens is a variable cost, and that this cost will not decrease over time.

> On an economic basis, a monthly subscription only makes sense with relatively static costs.

Running a data center is a fixed expense. Whether or not people use that data center to its capacity doesn't change how much the operator pays (electricity use factors into this, since a GPU running at 100% will use more watts than an idle one, but it doesn't move the needle much on other fixed and variable costs of a data center).
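
To put rough arithmetic on that, here is a minimal sketch of cost per token when most of the hourly cost is fixed and only part of the power bill scales with load. Every number and name in it (`fixed_per_hour`, `idle_power_per_hour`, `full_power_per_hour`, `tokens_per_hour_at_full`) is a made-up placeholder for illustration, not real data center or GPU pricing.

```python
def cost_per_million_tokens(
    utilization: float,
    fixed_per_hour: float = 2.00,         # hypothetical: amortized hardware + building, paid regardless of load
    idle_power_per_hour: float = 0.05,    # hypothetical: power cost of an idle GPU
    full_power_per_hour: float = 0.10,    # hypothetical: power cost of a GPU at 100% load
    tokens_per_hour_at_full: float = 5_000_000,  # hypothetical throughput at 100% load
) -> float:
    """Cost per one million tokens at a given utilization (0 < utilization <= 1)."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    # Only the gap between idle and full power scales with load.
    power = idle_power_per_hour + (full_power_per_hour - idle_power_per_hour) * utilization
    tokens = tokens_per_hour_at_full * utilization
    return (fixed_per_hour + power) / tokens * 1_000_000


for u in (0.25, 0.5, 1.0):
    print(f"utilization {u:.0%}: ${cost_per_million_tokens(u):.2f} per 1M tokens")
```

Under these assumed numbers the hourly bill barely changes with load, so the cost per token is driven almost entirely by how many tokens the fixed cost gets spread over.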

> They also assumed, I imagine, that the cost of tokens would come down over time, versus what actually happened — while prices for some models might have come down, newer “reasoning” models burn way more tokens, which means the cost of inference has, somehow, gotten higher over time.

This is backwards. When the cost of something goes down, people use it more. This is basic supply and demand. Inference has gotten cheaper already, and will continue to do so.

Companies subsidizing costs for growth happens all the time. Yes, switching to usage-based pricing instead of subscriptions sucks for customers, but enterprises will continue to pay.


Replies

xnx today at 5:41 PM

> it doesn't move the needle much on other fixed and variable costs of a data center

I wonder what the rough costs of a data center look like over the lifetime of one GPU generation?

10% building

60% GPU

30% power

I haven't gone looking for that information, but I haven't run across it either.