What do you mean about not doing evals? Just literally that you don’t run any benchmarks or do you h...

andy99 • last Tuesday at 11:44 PM • 2 replies

What do you mean about not doing evals? Just literally that you don’t run any benchmarks or do you have something against them?

Replies

danielmarkbruce • yesterday at 5:41 AM

He's just saying anecdotally these models are good. A reasonable response might be "have you systematically evaluated them?". He has pre-answered - no.

woodson • yesterday at 12:46 AM

Not OP, but perhaps they mean not putting too much faith in common benchmarks (thanks to benchmaxxing).
