A 40B-parameter model that beats Sonnet 4.5 and GPT 5.1? Can someone explain this to me?

adastra22 • last Saturday at 5:22 AM • 6 replies

Replies

cadamsdotcom • last Saturday at 5:39 AM

My suspicion (unconfirmed so take it with a grain of salt) is they either used some/all test data to train, or there was some leakage from the benchmark set into their training set.

That said, Sonnet 4.5 isn’t new, and there have been loads of innovations recently.

Exciting to see open models nipping at the heels of the big end of town. Let’s see what shakes out over the coming days.

behnamoh • last Saturday at 7:09 AM

IQuest stands for it's questionable

dk8996 • yesterday at 5:00 AM

I would think they did some model pruning. There are some new methods for that.

arthurcolle • last Saturday at 7:35 AM

Agent hacked the harness

sunrunner • last Saturday at 9:19 AM

“IQuest-Coder was a rat in a maze. And I gave it one way out. To escape, it would have to use self-awareness, imagination, manipulation, git checkout. Now, if that isn't true AI, what the fuck is?”
