Hacker News

esperent, today at 2:18 AM

> A validator that checks "did the assistant reply?" instead of "was the reply correct?" was never a benchmark. It was a participation trophy

People can't even write a two-paragraph comment without AI now.