Hacker News

wombat-man, yesterday at 6:57 PM

I'm still getting pretty good code out of it, but I only use it on side projects. Is the issue with their odd limit system?


Replies

JamesSwift, yesterday at 7:18 PM

I'm on pay-per-use plans, so the limits aren't the issue directly, although the product development process could lead to them trying to fix the limit issue and breaking the product as a whole.

The main issue seems to be side effects of effort/thinking. It hallucinates at a much higher rate and skips research in a ton of edge cases, even with effort set to MAX and adaptive thinking disabled, even on 4.6. I've said it before, but using Opus today feels like using Sonnet from around October. It's nowhere near what Opus 4.5 felt like in January, or even Opus 4.6 on release. Notably, 4.6 on release _really_ over-researched even simple tasks, and that behavior is almost entirely gone now even with max effort, so they are definitely re-tuning these things on the fly and degrading the experience as a result.

EDIT: I also strongly suspect that the way they hydrate thinking is buggy and/or lossy (or maybe unintentionally lossy, which leads to bugs). So many behaviors just make no sense at the level I have my setup tuned (I have everything set to "just charge me the most money to hopefully get the best results"), especially since I haven't changed anything while using it daily for months on end, yet have been getting worse and worse results.
