Der_Einzige, today at 4:24 PM

This is extremely obvious to anyone who's read other papers. There are tons of papers showing that LLMs prefer their own outputs. It's a big enough problem that, in papers, the LLM-as-judge has to be a different LLM from the one you are testing.