raesene9, today at 9:24 AM

The later Opus models (4.7/4.8), Sonnet 5, and particularly Fable 5 will refuse to do tasks related to offensive security.

One example I've hit: I'm working on a benchmark of how well LLMs handle Kubernetes security tasks, and it has a section where they exploit security misconfigurations. Opus 4.6 was fine with that section, 4.7 and 4.8 produced some refusals, and Fable point-blank refused to do any of it.

The only other model I've seen refuse is OpenAI GPT-5.5; all the open-weight models seem fine with it.

Ofc, if you need to do that kind of work a lot, you might be able to get on OpenAI's or Anthropic's allow-list for cyber work.