
nickjj · yesterday at 4:46 PM

I don't use Copilot or any paid AI, but all of this usage-based billing reminds me of cell phones back when you paid per individual text message.

Usage-based pricing for AI is 1000x crazier, because you don't even get a guarantee on the thing you're paying for in the end. You have to keep feeding it prompts and hope it gives you the solution you want. You may end up paying without ever getting the result you expected. At least with texting, you got what you paid for.

I wonder how long it'll be before all AI costs are flat unlimited monthly fees or even free across the board, without compromise.


Replies

Latty · yesterday at 6:54 PM

I expect in the future we'll find out that someone in the industry was juicing the numbers with fake thinking tokens or something. The whole pricing model of charging you for the tokens it generates, without you knowing going in how many it will generate, has always been pretty crazy.
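
A minimal sketch of why that stings, with made-up per-token rates (real prices vary by provider and model): the caller fixes the input token count, but the output count, and therefore most of the bill, is only known after the call.

    # Hypothetical rates, roughly in the ballpark of current frontier APIs.
    PRICE_IN = 3.00 / 1_000_000    # dollars per input token (assumed)
    PRICE_OUT = 15.00 / 1_000_000  # dollars per output token (assumed)

    def request_cost(input_tokens: int, output_tokens: int) -> float:
        # You choose input_tokens; the model chooses output_tokens.
        return input_tokens * PRICE_IN + output_tokens * PRICE_OUT

    # The same 2,000-token prompt, billed very differently depending on
    # how many (possibly "thinking") tokens the model decides to emit:
    for out in (500, 8_000, 60_000):
        print(f"{out:>6} output tokens -> ${request_cost(2_000, out):.4f}")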

benoau · today at 1:10 AM

Like internet usage back when you were billed by the hour but your connection was so slow it took a minute-plus to load pages, lol.

Sohcahtoa82 · yesterday at 6:32 PM

Yeah, this was my frustration with Suno and Sora. You can burn a lot of credits (not to mention time) generating things that aren't what you wanted.

I don't mind a PAYG model for a simple chat interface. But when it comes to actually producing things, you burn through TONS of tokens creating the wrong output.

tencentshill · yesterday at 7:55 PM

It incentivizes you to do most of that prompting on your own hardware/time, and to feed only the final prompt, with just the necessary context, to the big AI in the sky. It might even force you to think about the problems yourself for a bit!
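
One possible shape of that workflow, as a sketch: iterate against a local model served through an OpenAI-compatible endpoint (Ollama and llama.cpp both expose one), and pay for metered tokens only on the final call. The URL, model names, and prompts here are illustrative assumptions, not a recommendation.

    # Sketch: refine prompts locally for free, spend metered tokens once.
    # Assumes a local OpenAI-compatible server (e.g. Ollama's default URL)
    # and OPENAI_API_KEY set in the environment for the paid provider.
    from openai import OpenAI

    local = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")
    cloud = OpenAI()  # reads OPENAI_API_KEY from the environment

    def ask(client: OpenAI, model: str, prompt: str) -> str:
        resp = client.chat.completions.create(
            model=model,
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    # Draft against the local model as many times as you like (no meter)...
    draft = ask(local, "llama3", "Summarize the tradeoffs of usage-based AI billing.")

    # ...then send one final, trimmed prompt to the big AI in the sky.
    print(ask(cloud, "gpt-4o", f"Polish this, keep it under 100 words:\n{draft}"))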

DaiPlusPlus · yesterday at 6:43 PM

> I wonder how long it'll be before all AI costs are flat unlimited monthly fees or even free across the board, without compromise.

That's already the case if you can self-host an LLM; you don't even need a mythical H200: gamer-grade GeForce cards can get you a long way there (if this page is to be believed: https://www.runpod.io/gpu-compare/rtx-5090-vs-h200 )

...after RAM prices return to normalcy, of course - and after another 2 or 3 generations of GPU development put a 96GB HBM card on the streets - and also assuming SotA or cloud-only LLMs don't experience lifestyle-inflation. But I assume they must, because OpenAI/Anthropic/etc.'s business model depends on people paying them for access, so it's in their interest to make these models as difficult as possible to run locally.

Give it 5 years from now and reassess.
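
For the VRAM side of that, a back-of-envelope sketch (weights only; KV cache and runtime overhead add more in practice, and the bytes-per-parameter figures are the usual rough quantization estimates):

    # Rough VRAM needed just to hold model weights, by quantization level.
    BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}

    def weights_gb(params_billions: float, quant: str) -> float:
        # 1e9 params * bytes-per-param is ~GB, so the factors cancel.
        return params_billions * BYTES_PER_PARAM[quant]

    for size in (8, 70):
        line = ", ".join(f"{q}: ~{weights_gb(size, q):.0f} GB" for q in BYTES_PER_PARAM)
        print(f"{size}B model -> {line}")

    # A 32 GB card (RTX 5090 class) holds an 8B model at any precision,
    # but a 70B model needs ~35 GB even at int4 - hence the wish for a
    # 96 GB consumer card a few GPU generations out.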