Hacker News

EduardoBautista · yesterday at 9:23 PM

Yes, but how cheap is it to run four at the same time? It's tough to run even one good model locally, and running four at once, which I commonly do with Claude and Codex, just doesn't seem to be happening anytime soon.


Replies

Aurornis · yesterday at 10:44 PM

I'm referring to hosted models, e.g. via OpenRouter or the model providers' own services.
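
For what it's worth, here's a minimal sketch of what I mean, using the openai Python client against OpenRouter's OpenAI-compatible endpoint (the model slug and API key are placeholders; swap in whatever cheap hosted model you like):

    from openai import OpenAI

    # OpenRouter exposes an OpenAI-compatible API, so the standard client works as-is.
    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="YOUR_OPENROUTER_KEY",  # placeholder
    )

    resp = client.chat.completions.create(
        model="meta-llama/llama-3.1-70b-instruct",  # example slug; many cheaper models are listed
        messages=[{"role": "user", "content": "Hello"}],
    )
    print(resp.choices[0].message.content)

Running four of these concurrently is just four HTTP requests; the cost is per token, not per machine.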

I think everyone making claims that inference is getting more expensive is unaware that there are more LLM providers than Google, Anthropic, and OpenAI.