Hacker News

karmakaze, today at 12:51 PM

Not perfect, but I find the artificialanalysis.ai "Intelligence vs. Output Tokens Used in Artificial Analysis Intelligence Index" chart[0] (scroll down to the titled chart) to be of great use. A proper evaluation needs to compare 3 things together: score, speed, and verbosity. This chart plots score vs verbosity.

[0] https://artificialanalysis.ai/?models=gpt-oss-120b%2Cgemma-4...
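To make the "compare 3 things together" idea concrete, here is a minimal sketch that keeps only the models not beaten on all three axes at once (a Pareto front). Everything in it is an assumption for illustration: the `Result` fields, the model names, and the numbers are made-up placeholders, not values from the chart.

```python
from typing import NamedTuple


class Result(NamedTuple):
    name: str
    score: float           # benchmark score, higher is better
    output_tokens: float   # tokens used to run the benchmark (verbosity), lower is better
    tokens_per_sec: float  # generation speed, higher is better


def dominates(a: Result, b: Result) -> bool:
    """True if a is at least as good as b on every axis and strictly better on one."""
    no_worse = (
        a.score >= b.score
        and a.output_tokens <= b.output_tokens
        and a.tokens_per_sec >= b.tokens_per_sec
    )
    strictly_better = (
        a.score > b.score
        or a.output_tokens < b.output_tokens
        or a.tokens_per_sec > b.tokens_per_sec
    )
    return no_worse and strictly_better


def pareto_front(results: list[Result]) -> list[Result]:
    """Keep every result that no other result dominates."""
    return [r for r in results if not any(dominates(o, r) for o in results)]


# Hypothetical placeholder data, NOT taken from artificialanalysis.ai.
results = [
    Result("model_a", 60.0, 80e6, 150.0),
    Result("model_b", 55.0, 30e6, 200.0),
    Result("model_c", 50.0, 90e6, 100.0),
    Result("model_d", 65.0, 120e6, 50.0),
]

for r in pareto_front(results):
    print(r.name, r.score, r.output_tokens, r.tokens_per_sec)
```

With these placeholder numbers, model_c drops out because model_a beats it on score, verbosity, and speed; the other three each win on at least one axis, so which one is "best" still depends on how you weight the trade-off.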