Like identifying names of skateboard tricks from the description? https://skatebench.t3.gg/
I don’t care how practical it may or may not be, this is my new favorite LLM benchmark
I couldn't find an about page or anything similar. Is there one?
o3-pro is better than 5.2 pro! And GPT 5 high is the best. Really quite interesting.