To be fair to OP, I just added this to our blog after their comment, in response to the correct criticisms that our text didn't make it clear how bad GPT-5.2's labels are.
LLMs have always been very subhuman at vision, and GPT-5.2 continues in this tradition, but it's still a big step up over GPT-5.1.
One way to get a sense of how bad LLMs are at vision is to watch them play Pokemon. For example: https://www.lesswrong.com/posts/u6Lacc7wx4yYkBQ3r/insights-i...
They still very much struggle with basic vision tasks that adults, kids, and even animals can ace with little trouble.