Hacker News

jareds · last Wednesday at 12:32 AM · 2 replies

Is there a good place for easy comparisons of different models? I know gpt-oss-20b and gpt-oss-120b have different numbers of parameters, but I don't know what that means in practice. All my experience with AI has been with larger models like Gemini and GPT. I'm interested in running models on my own hardware, but I don't know how small I can go and still get useful output, both for simple things like fixing spelling and grammar and for complex things like programming.


Replies

ekidd · last Wednesday at 2:50 AM

One easy way to test different models is to purchase $20 worth of tokens from one of the OpenRouter-like sites. This will let you ask tons of questions and try out lots of models.
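As a minimal sketch of what that looks like: OpenRouter exposes an OpenAI-compatible chat completions endpoint, so comparing models mostly comes down to swapping one model slug. The slug, prompt, and environment variable name below are illustrative, not prescriptive:

```python
import json
import os
import urllib.request

# OpenRouter's OpenAI-compatible chat completions endpoint.
API_URL = "https://openrouter.ai/api/v1/chat/completions"


def build_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Construct the HTTP request for one chat completion.

    Trying a different model is just a matter of changing `model`.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    # Example: the model slug and env var name here are assumptions.
    req = build_request(
        model="qwen/qwen3-30b-a3b",
        prompt="Fix the spelling in: 'Teh quick brown fox.'",
        api_key=os.environ.get("OPENROUTER_API_KEY", ""),
    )
    # Uncomment to actually send the request (requires a valid API key):
    # with urllib.request.urlopen(req) as resp:
    #     print(json.load(resp)["choices"][0]["message"]["content"])
```

Running the same prompt against several slugs side by side gives a quick feel for where the quality drops off as models get smaller.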

Realistically, the biggest models you can run at a reasonable price right now are quantized versions of things like the Qwen3 30B A3B family. A 4-bit quantized version fits in roughly 15GB of RAM, so it will run very nicely on something like an Nvidia RTX 3090. You can also run it from regular system RAM, though it will be slower.
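The arithmetic behind that 15GB figure is simple: parameters × bits per weight ÷ 8 gives the raw weight bytes. A rough back-of-the-envelope estimator (the overhead factor for KV cache and runtime buffers is my own assumption, not a fixed rule):

```python
def quantized_size_gb(params_billion: float, bits_per_weight: float,
                      overhead_factor: float = 1.2) -> float:
    """Rough RAM/VRAM estimate for a quantized model.

    params * bits / 8 gives the raw weight bytes; the overhead factor
    (an assumption here) covers KV cache, activations, and buffers.
    """
    raw_gb = params_billion * bits_per_weight / 8  # billions of bytes ~ GB
    return raw_gb * overhead_factor


# A 30B model at 4 bits: 30 * 4 / 8 = 15 GB for the weights alone,
# so closer to ~18 GB once overhead is included -- which is why it
# fits on a 24GB RTX 3090 but not on a 16GB card with a long context.
print(round(quantized_size_gb(30, 4), 1))
```

The same formula explains why an 8-bit quantization of the same model (~30GB of weights) pushes you off a single consumer GPU.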

These models aren't competitive with GPT-5 or Opus 4.5! But most of them are noticeably better than GPT-4o, some by quite a bit. Some of the 30B models will even work as basic agentic coders.

There are also some great 4B to 8B models from various organizations that will fit on smaller systems. An 8B model, for example, can be a great translator.

(If you have a bunch of money and patience, you can also run something like GPT OSS 120B or GLM 4.5 Air locally.)
