Hacker News

hu3 · today at 3:49 AM · 1 reply

Open weight models are neat.

But for SOTA performance you need specialized hardware, even for open weight models.

$40k in consumer hardware is never going to compete with $40k of AI-specialized GPUs/servers.

Your link starts with:

> "Using a single top-of-the-line gaming GPU like NVIDIA’s RTX 5090 (under $2500), anyone can locally run models matching the absolute frontier of LLM performance from just 6 to 12 months ago."

I highly doubt an RTX 5090 can run anything that competes with Sonnet 3.5, which was released in June 2024.


Replies

Lapel2742 · today at 9:58 AM

> I highly doubt an RTX 5090 can run anything that competes with Sonnet 3.5, which was released in June 2024.

I don't know about the capabilities of a 5090, but you can probably run a Devstral-2 [1] model locally on a Mac with good performance. Even the small Devstral-2 model (24B) seems to easily beat Sonnet 3.5 [2]. My impression is that local models have made huge progress.

Coding aside, I'm also impressed by the Ministral models (3B, 8B, and 14B) that Mistral AI released a couple of weeks ago. The Granite 4.0 models from IBM also seem capable in this context.

[1] https://mistral.ai/news/devstral-2-vibe-cli

[2] https://www.anthropic.com/news/swe-bench-sonnet