> Grok 3 performs at a level comparable to, and in some cases even exceeding, models from more ma...

graycat • last Thursday at 5:55 PM

> Grok 3 performs at a level comparable to, and in some cases even exceeding, models from more mature labs like OpenAI, Google DeepMind, and Anthropic. It tops all categories in the LMSys arena and the reasoning version shows strong results—o3-level—in math,....

"Math"? Fields Medal level? Tenure? Ph.D.? ... high school plane geometry???

As in

'Grok 3 AI and Some Plane Geometry'

https://news.ycombinator.com/item?id=43113949

Grok 3 failed at a plane geometry exercise.
