Hacker News

lazide, yesterday at 6:46 PM

LLMs don’t work in a predictable, deterministic way that would make it easy to filter out these kinds of responses.

It’s gotten better, but it’s still typically pretty easy to bypass protections that are currently in place.


Replies

iugtmkbdfil834, yesterday at 7:50 PM

I think the parent’s point is why those protections need to be there at all.