Hacker News

EMIRELADERO 01/22/2025

I mean... you can just firewall it?


Replies

slt2021 01/22/2025

You don't know which prompt activates the backdoor, so how can you firewall it if you run the model in production?
