Hacker News

redman25, today at 4:24 AM

https://www.swebench.com

https://swe-rebench.com

https://livebench.ai/#/

https://eqbench.com/#

https://contextarena.ai/?needles=8

https://metr.org/blog/2025-03-19-measuring-ai-ability-to-com...

https://artificialanalysis.ai/leaderboards/models

https://gorilla.cs.berkeley.edu/leaderboard.html

https://github.com/lechmazur/confabulations

https://dubesor.de/benchtable

https://help.kagi.com/kagi/ai/llm-benchmark.html

https://huggingface.co/spaces/DontPlanToEnd/UGI-Leaderboard