EduardoBautista • yesterday at 9:23 PM

Yes, but how cheap is it to run four at the same time? It’s tough to run one good model locally, but running four at the same time, which I commonly do with Claude and Codex, just doesn’t seem to be happening anytime soon.

Replies

Aurornis • yesterday at 10:44 PM

I'm referring to hosted models, such as those available via OpenRouter or from the model providers' own services.

I think everyone claiming that inference is getting more expensive is unaware that there are more LLM providers than Google, Anthropic, and OpenAI.
