gradientsrneat • 10/11/2024

Could this be Goodhart's Law in action? AI tools like to showcase benchmarks in bar graphs to show how well they perform compared to other models.

Maybe the benchmark questions and answers leaked into the training sets by accident. Is it still Goodhart's Law if it's unintentional?

Daniel Lemire has blogged about being impressed with how well the LLM answers his CS problem questions. I was impressed too. Not sure where the line of competence lies.
