> Easy for who?
Consider things from a different angle.
The hype men promoting the latest LLMs say the newest models produce PhD-level performance across a broad suite of benchmarks; some have even claimed that ChatGPT 4 is an early version of an AGI system that could become super-intelligent.
So the advertising teams have set the bar very high indeed. As smart as the smartest humans around, maybe smarter.
The bar they have set for themselves doesn't allow for any "oh but the tokenisation" excuses.
> The hype men promoting the latest LLMs say the newest models produce PhD-level performance across a broad suite of benchmarks; some have even claimed that ChatGPT 4 is an early version of an AGI system that could become super-intelligent.
Alright, why don't you go and discuss this with the people who say those things instead? No one made those points in this subthread, so not sure why they get brought up here.
Most human math phd's have all kinds of shortcomings. The idea that finding some "gotchas" shows that they are miles off the mark with the hype is absurd.