sciencejerk • today at 3:46 AM

Go to GitHub and look for jailbreaks targeting the newest models. Try them out. You'll be surprised by the results.

You're correct that it's gotten substantially harder to social-engineer frontier models (I can only reliably do it to Opus <=4.6), but some techniques still seem to work consistently (hint: extremely large, complex prompts, or tons of malicious files mixed into otherwise ordinary context).
