I am still baffled by the fact that we have collectively agreed to use agentic harnesses by the same companies that are selling access to their APIs.
I mean, I am sure they don't mean it but they have the incentive to burn as much tokens as they are allowed to get away with. Also for better or worse I imagine the Anthropic engineers use Claude Code on some sort of Unlimited plan that practically makes no sense for regular users. So adding a 100k tokens is not a big deal.
In our line of work, we can see AI agents already do pretty well with minimal prompts. Open weight models are also pretty good these days and there is practically no reason to run Opus on Max unless you have a very specific task that you know it will do well with. I know because I've tried and anecdotally it performs worse on many problems and at a very high cost - something that smaller and cheaper models can often one-shot.
> I am still baffled by the fact that we have collectively agreed to use agentic harnesses by the same companies that are selling access to their APIs.
It's because the subscriptions force you to do so. The subscriptions are the most economical way to use e.g. Claude by close to an order of magnitude. If you max out a 20x plan every week, doing the same work with the API would cost you well into the four figures.
Anyone already using the Claude API pricing and using CC over OpenCode is kneecapping themselves.
This is why the subscriptions are important. When the usage is (vaguely) unmetered, the provider has an incentive to make usage cheap on marginal use.
It aligns the incentives for faster, cheaper, terse and more reliable models, because the model providers pay the wasted tokens and electricity costs.
It makes perfect sense to me for an AI system to be vertically owned that way you can do vertical optimization.
>I am still baffled by the fact that we have collectively agreed to use agentic harnesses by the same companies that are selling access to their APIs.
the best performing and capable ones are all the ones that aren't tied to a specific api.
> adding a 100k tokens is not a big deal
Did you mean 100 billion tokens because 100k isn't a big deal at all?
no, they have incentive to charge as much as they want, butt they have massive costs / capacity constraints per token, if anything they have a major incentive to reduce them because they literally cannot meet demand.
They also have incentive to nerf models occasionally, so they rarely one shot the task and more often they do it wrong and then you have to spend on tokens to correct it. Bonus points if model suddenly goes completely dumb then you have to start the session over.
I don't think we've agreed to anything. That said I think paying for something like Claude Code makes a lot of sense because you can outsource the question of "how many tokens should I use per hour and how should I use them?" to the people providing the tokens.
If you want to plug your API keys into a third-party harness, that's totally cool and honestly, I'm looking into doing that right now and I haven't used any of the first-party harnesses at all. But the first time I accidentally spend $300 in a day I may be thinking about how a $20/month plan might be pretty good even if performance is inconsistent, at least I know what my costs are.