Hacker News

kmkrworks, today at 6:27 AM

I don't think this is an optimal way of comparing models, and I really don't think any metric available right now can pick out the best one. It prioritizes individual tasks over overall ability, and I don't think measuring overall ability is even possible.