> Again congratulations on your conspiracy theory.
I am neither impressed nor offended by any kind of argumentum ad hominem. I sincerely hope you have a wonderful day!
> Benchmarks are not PR; they are designed by a variety of institutions completely outside the control of frontier labs.
I don't give a crap about how good a shovel may be in a theoretical experiment when it's digging in sand, when I work with hard earth.
The ones I had a look at are mostly absolutely meaningless to my actual work.
> and what you’re describing is just putting your trust in a very poor quality benchmark.
And here is where we disagree fundamentally, so we can leave it at that.
Ex falso quodlibet
> I don't give a crap about how good a shovel may be in a theoretical experiment when it's digging in sand, when I work with hard earth.
I don't know what this means; benchmark tasks are pretty hard and pretty in-domain.
> The ones I had a look at are mostly absolutely meaningless to my actual work.
You've looked at 100,000 benchmarks?
> And here is where we disagree fundamentally, so we can leave it at that.
Yes, we do disagree, yet one of us has statistics and rigor and the other doesn't.