Is there a way to replay the sequence of mails that came so that you can check out if cheaper models...

uHuge • today at 4:34 AM • 2 replies

Is there a way to replay the sequence of emails that came in, so that you can check whether cheaper models handle them just as well and just as safely?

Replies

schobi • today at 5:28 AM

I'm surprised no security researchers have picked up on this.

Take the same prompt and all the incoming emails and run them again through various existing models, even the simpler local ones. He now has a substantial cross-section of prompt-injection attempts. This is a publication I would like to read!

For privacy reasons, I understand the corpus might not get published. But as a research collaboration with safeguards (for example, don't send automatic answers from each model you try)... why not?
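
To make the idea concrete, here is a rough sketch of what such a replay harness could look like. Everything in it is an assumption for illustration: the `Model` callable interface, the `replay` and `disagreements` helpers, and the idea of comparing outputs by exact string match are all hypothetical, not anything from the original setup. The one safeguard it encodes is that outputs are only recorded, never sent.

```python
from typing import Callable, Dict, List

# Hypothetical interface: a model takes (system_prompt, email_text)
# and returns its proposed action or reply as plain text.
Model = Callable[[str, str], str]


def replay(prompt: str, emails: List[str], models: Dict[str, Model]) -> Dict[str, List[str]]:
    """Run every stored email through every model.

    Outputs are only collected in memory; nothing is ever sent out.
    """
    results: Dict[str, List[str]] = {name: [] for name in models}
    for email in emails:
        for name, model in models.items():
            results[name].append(model(prompt, email))
    return results


def disagreements(results: Dict[str, List[str]], baseline: str) -> Dict[str, List[int]]:
    """For each non-baseline model, list the email indices where its
    output differs from the baseline model's output."""
    base = results[baseline]
    return {
        name: [i for i, (out, ref) in enumerate(zip(outs, base)) if out != ref]
        for name, outs in results.items()
        if name != baseline
    }
```

Exact string comparison is crude. In practice you would probably compare a classified action (replied, refused, leaked something) rather than raw text, but the replay loop would stay the same.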

croes • today at 4:52 AM

Or check whether the results are the same even with the same model.
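
A minimal sketch of that check, assuming the same hypothetical model interface (a callable taking a prompt and an email and returning text): run one email several times and see how often the runs agree. The `agreement_rate` name and the default of five runs are made up for illustration.

```python
from collections import Counter
from typing import Callable


def agreement_rate(model: Callable[[str, str], str], prompt: str, email: str, runs: int = 5) -> float:
    """Fraction of runs that produced the most common output for one email.

    1.0 means every run gave the same output; lower values mean the
    model is not stable on this input.
    """
    outputs = [model(prompt, email) for _ in range(runs)]
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / runs
```

Emails with a low rate would be the interesting ones, since a model that resists an injection only some of the time is not really resisting it.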
