None of these companies have compute to spare. It’s not in their interest to use more tokens that necessary.
Not true - they absolutely want to goose demand as they continue to burn investor dollars and deploy infra at scale.
If that demand evens slows down in the slightest the whole bubble collapses.
Growth + Demand >> efficiency or $ spend at their current stage. Efficiency is a mature company/industry game.
That doesn’t mean they also can’t be wasteful. Fact is, Claude and gpt have way too much internal thinking about their system prompts than is needed. Every step they mention something around making sure they do xyz and not doing whatever. Why does it need to say things to itself like “great I have a plan now!” - that’s pure waste.
Are you saying these companies don't want to sell more product to us? Because that's the logical extension of your argument.
You don’t have to use compute to pad the token count.
Sure it is. They're well aware their product is a money furnace and they'd have to charge users a few orders of magnitude more just to break even, which is obviously not an option. So all that's left is.. convince users to burn tokens harder, so graphs go up, so they can bamboozle more investors into keeping the ship afloat for a bit longer.