logoalt Hacker News

stingraycharlestoday at 2:13 AM0 repliesview on HN

I think the parent’s point is that this should be implemented using e.g. Bayesian statistics rather than an LLM, as the judge LLM is vulnerable to the exact same types of attacks that it’s trying to protect against.

Most proper LLM guardrails products use both.