jiggunjer • today at 6:47 AM

Those are objective metrics, not an objective way to compare. The subjective part is the selection of which metrics to include.

Replies

That's exactly why there's a ton of different benchmarking suites used for evaluating hardware performance.

I reckon we'll have similar suites comparing different aspects of models.

And, at some point, we'll be dealing with models skewing results whenever they detect they're being benchmarked, as happened before with hardware. Some say that's already happening with the pelican test.
