Hacker News

bakugo, today at 4:28 PM

I've heard that it was possible to trigger really obvious output poisoning on Fable with something as basic as asking the model to think outside of its built-in hidden thinking delimiters.

This watermark may trigger a similar mechanism.