nikisweeting • yesterday at 9:48 PM

We can definitely make harder evals. The problem is that a good eval set is indistinguishable from good training data / market edge, so no one is incentivized to share their best eval sets publicly.
