Hacker News

retinaros, yesterday at 10:39 PM, 1 reply

Every AI lab trains on the test set. That is a big part of why we see benchmark scores climb from 1% to 30% after a few model iterations.


Replies

latentsea, today at 3:57 AM

Models themselves definitely aren't getting better.