This only works against crude attacks which will fail the schema/canary check, but does next to nothing for semantic hijacking, memory poisoning and other more sophisticated techniques.
With misinformation attacks, your can instruct research agent to be skeptical and thoroughly validate claims made by untrusted sources. TBH, I think humans are just as likely to fall for these sorts of attacks if not more-so, because we're lazier than agents and less likely to do due diligence (when prompted).
With misinformation attacks, your can instruct research agent to be skeptical and thoroughly validate claims made by untrusted sources. TBH, I think humans are just as likely to fall for these sorts of attacks if not more-so, because we're lazier than agents and less likely to do due diligence (when prompted).