Hacker News

frank_nitti · yesterday at 8:41 PM · 5 replies

Interesting to see your observation where I have observed the opposite: posts that share big news about open-weight local models have many upvoted comments arguing local models shouldn’t be taken seriously and promoting the SOTA commercial models as the only viable options for serious developers.

Both here and on AI tech subreddits (ones that aren’t specifically about local or FOSS models), I see this dynamic, to the degree that I’ve suspected astroturfing.

So it’s refreshing to see that maybe it’s just coincidence or confirmation bias on my end.


Replies

lxgr · yesterday at 9:27 PM

Both can be true at the same time. I currently wouldn't waste my time with open models for almost all use cases, but they're crucial from a data privacy and competitive perspective, and I can't wait for them to catch up enough to be as useful as the current frontier models.

nl · today at 1:47 AM

I've invested significant time into getting open models to work, and investigating what works well.

The TL;DR is that unless you are doing it as a hobby, or working in an environment where none of the data privacy options supported by Anthropic/OpenAI (including running on Azure/Bedrock with ZDR) works for you, it's not worth it.

The best open models are around the Sonnet 4.6 level. That's excellent, but the level of tasks you can give to GPT 5.4 or Opus 4.6 is just so much higher it doesn't compare (and Opus 4.7 seems noticeably better in my few hours of testing too).

I have my own benchmarks, but I like this much under-publicized OpenHands page: https://index.openhands.dev/home

It shows that for every task they test, closed models do the best. The closest an open model gets is Minmax 2.7 on issue resolution, where it's ~1% worse than the leaders.

That matches my experience: fine for small problems, but well behind as the task gets bigger.

bloppe · yesterday at 9:10 PM

I'm just waiting till I can afford a GPU again

echelon · today at 1:48 AM

> Interesting to see your observation where I have observed the opposite: posts that share big news about open-weight local models have many upvoted comments arguing local models shouldn’t be taken seriously and promoting the SOTA commercial models as the only viable options for serious developers.

When I argue this, my point is that FOSS shouldn't target the desktop with its open weights; it should target H200s: really big-parameter models with big VRAM requirements.

Those can always be distilled down, but you can't really go the other way.
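The asymmetry the comment points at is knowledge distillation: a big teacher model's softened output distribution can supervise a smaller student, but a small model carries no extra signal you could use to train a bigger one. A minimal, stdlib-only sketch of the core distillation objective (temperature-softened softmax plus KL divergence, per Hinton-style distillation; the function names here are illustrative, not from any particular library):

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-softened softmax; higher temperature flattens the distribution."""
    z = [x / temperature for x in logits]
    m = max(z)  # subtract max for numerical stability
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over softened distributions.

    A student trained to minimize this mimics the teacher's soft labels,
    which is why distilling down works but "distilling up" has no analogue:
    the small model has nothing richer than its own outputs to teach from.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))
```

The loss is zero when the student already matches the teacher and grows as their output distributions diverge; in real training it is summed over a dataset and backpropagated through the student only.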