gman83 • 01/20/2025 • 2 replies

For months now I've seen benchmarks for lots of models that beat the pants off Claude 3.5 Sonnet, but when I actually try to use those models (using the Cline VS Code plugin) they never work as well as Claude for programming.

Replies

joshuacc • 01/20/2025

Part of that is that Claude is exceptionally good at turn-based interactions compared to other models that are better at one-shot reasoning.

raincole • 01/20/2025

After actually using DeepSeek-V3 for a while, the difference between it and Sonnet 3.5 is just glaring. My conclusion is that the hype around DeepSeek comes from either 1) people who use LLMs far more than a programmer reasonably would, so they're very price sensitive, like repackaging service providers, or 2) astroturf.
