How well do local OSS models stack up to Claude?
Very well for narrowly scoped purposes.
They decohere much faster as the context grows, which is fine or not depending on whether you see yourself as a software engineer amplifying your output by automating the boilerplate, or as an LLM cornac.
They don't; they only look competitive on meaningless benchmarks.
Much better than they did half a year ago, but a single RTX 6000 won't get you there.
Models in the 700B+ category (GLM5, Kimi K2.5) are decent, but running them on your own hardware is a six-figure investment. That's realistic for a company; as a private person, instead pick a provider you like from openrouter's list of inference providers.
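If you go the openrouter route, here's a minimal sketch of what a request to their OpenAI-compatible chat completions endpoint looks like, using only the standard library. The model ID is illustrative (check their model list for current names), and the snippet only builds the request rather than sending it:

```python
# Sketch: querying a large open-weights model via OpenRouter's
# OpenAI-compatible chat completions endpoint instead of self-hosting.
# The model ID is illustrative; OPENROUTER_API_KEY must be set to actually send.
import json
import os
import urllib.request

def build_request(prompt: str, model: str = "moonshotai/kimi-k2") -> urllib.request.Request:
    """Build (but don't send) a chat completions request."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ.get('OPENROUTER_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_request("Explain borrow checking in one paragraph.")
print(req.full_url)
```

To actually send it, pass the request to `urllib.request.urlopen` (or use any OpenAI-compatible client pointed at the same base URL).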
If you really want local on a realistic budget, Qwen 3.5 35B is OK, but it's nowhere near Claude Opus.
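As a rough sanity check on what "realistic budget" means, here's a back-of-envelope weight-memory estimate. This is an approximation that ignores KV cache and runtime overhead, and the bits-per-weight figures are rules of thumb for common quantization formats:

```python
# Rough back-of-envelope VRAM estimate for the weights of a local model.
# Rule of thumb only: actual usage adds KV cache and runtime overhead.
def weight_gb(params_b: float, bits_per_weight: float) -> float:
    """Approximate weight memory in GB for params_b billion parameters."""
    return params_b * 1e9 * bits_per_weight / 8 / 1e9

# fp16, int8, and ~4.5 bits/weight (roughly a Q4_K_M-style quant)
for bits in (16, 8, 4.5):
    print(f"35B  at {bits} bits/weight: ~{weight_gb(35, bits):.0f} GB weights")
    print(f"700B at {bits} bits/weight: ~{weight_gb(700, bits):.0f} GB weights")
```

The 35B numbers show why a single 48 GB card handles a quantized mid-size model but not fp16, and the 700B numbers show why the big models push you into multi-GPU, six-figure territory.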