We're using small language models to detect prompt injection. Not too cool, but at least we can publish some AI-related stuff on the internet without a huge bill.
What kind of prompt injection attacks do you filter out? Have you tested with a prompt tuning framework?
What kind of prompt injection attacks do you filter out? Have you tested with a prompt tuning framework?