I've invested significant time into getting open models to work, and investigating what works w...

nl • today at 1:47 AM • 0 replies • view on HN

I've invested significant time into getting open models to work, and investigating what works well.

The TL;DR is that unless you are doing it as a hobby or working in an environment where none of the data privacy options supported by Anthropic/OpenAI (including running on Azure/Bedrock with ZDR) work for you then it's not worth it.

The best open models are around the Sonnet 4.6 level. That's excellent, but the level of tasks you can give to GPT 5.4 or Opus 4.6 is just so much higher it doesn't compare (and Opus 4.7 seems noticeably better in my few hours of testing too).

I have my own benchmarks, but I like this much under-publicized OpenHands page: https://index.openhands.dev/home

It shows for every task they test closed models do the best. The closest and open model gets is Minmax 2.7 on issue resolution where it's ~1% worse than the leaders.

That matches my experience - fine for small problems, but well behind has the task gets bigger.

alt Hacker News