Hacker News

paulatreides yesterday at 10:55 PM

It triggered for my... Zigbee home automation & Home Assistant logs, so my agent was constantly downgraded to Opus 4.8 even after I changed it back. The false positives never stopped. "Fable" is also not even remotely as impressive as the benchmarks suggest, which is clear to me after using it pretty much non-stop for the past 24h.


Replies

lambda today at 1:36 AM

I suspect it's even more expensive to run than they are charging for. These safeguards are just an excuse to get people to use it less, because it's not actually sustainable to use. They want to tempt people to consider them the leader, and it may actually be somewhat stronger, but too expensive to actually use at scale, so they nerf it by downgrading you constantly.

reactordev yesterday at 10:58 PM

This. Fable is exactly that: a fable.

fluidcruft yesterday at 11:58 PM

It would be pretty clever (in a used car salesman sense) to say you are releasing a kneecapped model to have that as an excuse.

kraakf06 today at 1:23 AM

False positives like this are probably more damaging than the guardrails themselves. If engineers can't predict when a model will switch behavior, it becomes difficult to trust it in production workflows.

NewsaHackO yesterday at 11:28 PM

It has to be sort of impressive, given that you tried so hard to use it instead of the regular Opus.
