WhitneyLand • yesterday at 8:11 PM • 3 replies

“This beats the latest Sonnet while running locally”

Not really.

- The benchmarks are based on F8_E4M3 and you’re not running that on any Mac.

- Sonnet has a 1M token context window. This is 256k but again you’re probably not even getting that locally.

- Sonnet is fast over the wire. This is going to be much slower.

Replies

trvz • yesterday at 11:15 PM

> Sonnet is fast over the wire.

Except when it’s unavailable. For sovereignty, the downsides are worth it to some.

trueno • yesterday at 8:36 PM

the benchmarks we're using to measure llms do no justice when everyone's mental benchmark is simply "is it going to feel like using claude" and the answer is still no. the entire llm space is stuffed with tons of crazy datapoints and vernacular that barely paint the picture of the mental benchmark everyone is after.

i too am desperate to just sever ties with these big providers, my fingers are crossed we get there within the constraints of local hardware even if that means me spending 3-5k i just want off this wild ride.

varispeed • yesterday at 11:34 PM

Not sure if a 1M token window is meaningful with Sonnet/Opus. The models go dumb quickly as context increases, making them unusable (that is, if you get routed to actual Opus; otherwise they are just dumb regardless of context window).
