I see that with openai too, lots of responding to itself. Seems like a convenient way for them to churn tokens.
None of these companies have compute to spare. It’s not in their interest to use more tokens that necessary.
All the labs are in a cut throat race, with zero customer loyalty. As if they would intentionally degrade quality/speed for a petty cash grab.
This, so much this!
Pay by token(s) while token usage is totally intransparent is a super convenient money printing machinery.
A simpler explanation (esp. given the code we've seen from claude), is that they are vibecoding their own tools and moving fast and breaking things with predictably sloppy results.