> Good illustration that those guardrails are ineffective and trivial to bypass.
Is that genuinely surprising to anyone? The same applies to humans, really: if they don't see the full picture and their individual contribution seems harmless, they will mostly do as told. Asking critical questions is a rare trait.
I would argue it's completely futile to even work on guardrails if defeating them is just a matter of reframing the task in any of an infinite number of ways.
> I would argue it's completely futile to even work on guardrails
Maybe, if humans were the only ones prompting AI models.