Hacker News

SchemaLoad today at 3:41 AM

Once the model has seen the questions and answers in the training stage, the questions are worthless. Only a test using previously unseen questions has merit.


Replies

lambda today at 3:46 AM

They aren't training new models for this. This is an agent harness for Opus 4.6.
