compared to your test with GLM 5.1, this indeed looks off

throwaw12 • yesterday at 5:09 PM • 2 replies

https://xcancel.com/simonw/status/2041646779553476801

Replies

Yeah GLM 5.1 did an outstanding job on the possum - better than Opus 4.7 or GPT-5.4 and I think better than Gemini 3.1 Pro too.

But GLM 5.1 is a 1.51TB model; the Qwen 3.6 I used here was 17GB - that's roughly 1/88 the size.


refulgentis • yesterday at 5:18 PM

Hoping this doesn't turn into a pelican-SVG back-and-forth: yesterday's GPT Image 2 thread ended up being three screenfuls of "I tried the prompt too" replies, with nothing about the model itself until you scrolled past them. I appreciate the testing, and I know this sounds like fun police, but there's a pattern where well-known commenter + one-off vibe test + 1:1 sub-threads eats the whole discussion. The fact that it's fun makes it hard to push back on without looking picky.



Replies