> Grok 3 performs at a level comparable to, and in some cases even exceeding, models from more mature labs like OpenAI, Google DeepMind, and Anthropic. It tops all categories in the LMSys arena and the reasoning version shows strong results—o3-level—in math,....
"Math"? Fields Medal level? Tenure? Ph.D.? ... high school plane geometry???
As in 'Grok 3 AI and Some Plane Geometry' (https://news.ycombinator.com/item?id=43113949), Grok 3 failed at a plane geometry exercise.