Hacker News

motbus3 today at 7:48 AM

Without going into details: a little bird told me about a project that was estimated to use several million tokens per day to automate the work of a team that was then laid off. The operation is now a mess, no one is willing to be held liable, and since the cheap model they used is about to be retired, the company is going to see at least a 4x price increase.

I have the feeling that the age of "I can't be blamed for AI stuff" will, for a while, be the new "this was the computer guy's mistake".

PS: I've been using Claude Opus 4.8 and it is worse than 4.6; I'd say even Sonnet 4.6 is better. "PhD-level" software engineering, I believe! I know many PhDs who have never coded or worked anyway.


Replies

RamblingCTO today at 9:07 AM

Glad I'm not the only one. Almost every factual thing with the new Opus is wrong (and it now even happens with 4.6?). I asked it about car stuff yesterday and it totally misrepresented what a car axle even looks like, fundamentally. Today I talked to it about my CV and it was just plain wrong. I don't know what happened; it wasn't like this a few weeks back, and I'm even considering cancelling Claude altogether. GPT 5.5 is fine for coding and way more stable, but regular work is just broken.

user_7832 today at 8:47 AM

On the topic of older (Claude) models being better... does anyone know of anything close to 3.5 (or 3.6) era Sonnet? It was by far the best LLM I had ever put my questions to. It actually explained things in a human way, not like some AI I need to reread three times to understand.

(I've used modern Gemini 3.1 Pro and Claude too. Modern ChatGPT is just as useless; I've never heard a human speak in bullet points. The human brain never encounters that IRL.)

prodigycorp today at 8:27 AM

To me this is clearly a skill issue. Several million tokens per day is peanuts, even if uncached. gpt-5.5 is $5 per million input tokens.
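For a rough sense of scale, the arithmetic behind that claim can be sketched as below. This is only an illustration: the comment says "several million" without giving a number, so the 3,000,000 tokens per day here is a hypothetical stand-in, the $5 per million input tokens is the price quoted in the comment, and output-token pricing (not mentioned in the thread) is left out entirely.

```python
def daily_input_cost(tokens_per_day: int, usd_per_million: float) -> float:
    """Cost in USD of one day's input tokens at a flat per-million price."""
    return tokens_per_day / 1_000_000 * usd_per_million


# Hypothetical volume: the thread only says "several millions of tokens per day".
tokens_per_day = 3_000_000
# Input price quoted in the comment; output tokens are not counted here.
usd_per_million_input = 5.0

daily = daily_input_cost(tokens_per_day, usd_per_million_input)
monthly = daily * 30  # simple 30-day month

print(f"daily: ${daily:.2f}, monthly: ${monthly:.2f}")
# prints "daily: $15.00, monthly: $450.00"
```

Under these assumed numbers, the input side comes to about $15 a day; the real figure depends on the actual volume and on output tokens, neither of which the thread specifies.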

Anybody doing things seriously understands how to optimize their workflows for smaller models once they start to lock in processes.

platinumrad today at 8:19 AM

I don't doubt that the operation as a whole is a disaster, but they should be able to avoid the price increase by switching to one of the many other cheap models, like DeepSeek V4 Flash, right?