Does that mean we should use a larger model as judge for evals, not a smaller one?

ai_slop_hater • yesterday at 9:05 PM

dist-epoch • yesterday at 9:54 PM

That was always the advice. Use the best model you can afford.

But some problems are easy and you can get away with a smaller model.
