
mikkupikku • yesterday at 7:20 PM

The benchmark game is wholly gamed, but the proof is in the pudding. I know people using Anthropic, OpenAI, and Gemini, and people running Chinese models locally. But who uses Grok for anything but porn? Whatever the benchmarks might say, Grok is just trash in practice. They spent too much time teaching it to be edgy and not enough time teaching it to code.

Replies

scottyah • yesterday at 11:37 PM

OK, sounds like your mind is already made up.


alt Hacker News
