logoalt Hacker News

CuriouslyCtoday at 12:02 AM1 replyview on HN

The analytic pass doesn't need to be perfect, it just needs to be good enough at mitigating the injection that the model's alignment holds. If you just redact a few hot words in an injection and join suspect words with code chars rather than spaces, that disarms a lot of injections.


Replies

chrisjjtoday at 12:10 AM

Lets filter spam like its 1999! :)