ikurei • today at 8:38 AM

Qwen 3.7 Max: > During my local testing before the full eval harness it was the only non-GPT model that was able to complete the task, was not able to reproduce in the longer runs.

Doesn't that sound like maybe the harness was the problem?
