Inference speed is still slow in a meaningfully different way. The models are good, but not great, and much slower, which for coding means a task that takes 2-3 minutes with Claude Code and Opus takes an hour and has a higher chance of being wrong.
It's only slow if you can't afford to run it properly. A lot of people are getting 70-100 tokens per second on a single GPU.
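To put those decode rates in perspective, here's a rough back-of-envelope sketch. The token count is hypothetical, just to show how wall-clock time scales with tokens per second:

```python
def task_seconds(total_tokens: int, tokens_per_second: float) -> float:
    """Wall-clock seconds to generate total_tokens at a given decode rate."""
    return total_tokens / tokens_per_second

# Hypothetical coding task generating ~30k tokens across its turns:
for tps in (10, 70, 100):
    print(f"{tps:>3} tok/s -> {task_seconds(30_000, tps) / 60:.0f} min")
```

At 10 tok/s that hypothetical task takes ~50 minutes; at 70-100 tok/s it drops to 5-7 minutes, which is roughly the gap being described.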
Not sure what Claude Opus or Sonnet run at. I know when it goes offline it's 0 tokens per second.