Hacker News

cicko today at 8:41 AM

True. But there are other meanings of "free". For instance, nobody can say "from now on you no longer have access to model X because you're an asshole."


Replies

trollbridge today at 11:13 AM

Some obvious reasons to spend the capital on this: you're building some kind of autonomous system that needs to run offline periodically, or you need complete confidentiality about what you're using the model for, etc.

To be cost-competitive with inference providers, you have to find some way to keep the hardware in use 24/7.
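
A rough way to sanity-check the utilization point for your own setup is a break-even sketch like the one below. It assumes a simple model: owned hardware has a fixed amortized cost per hour whether or not it is busy, plus a power cost only while busy, and the provider charges per token. The function name and every number in the usage example are hypothetical placeholders, not real prices or benchmarks.

```python
def break_even_utilization(hardware_cost, lifetime_hours, power_cost_per_busy_hour,
                           tokens_per_busy_hour, provider_price_per_million_tokens):
    """Fraction of wall-clock time the local box must be busy to match a provider.

    Assumed model (not from the thread): at utilization u,
      local cost per hour    = fixed + u * power
      provider cost per hour = u * value_of_one_busy_hour
    Setting the two equal gives u = fixed / (value_of_one_busy_hour - power).
    """
    # Amortized capital cost per hour, paid whether or not the box is busy.
    fixed_per_hour = hardware_cost / lifetime_hours
    # What the provider would charge for the tokens produced in one busy hour.
    provider_value = tokens_per_busy_hour / 1e6 * provider_price_per_million_tokens
    # Savings per busy hour after paying for power.
    margin = provider_value - power_cost_per_busy_hour
    if margin <= 0:
        # Power alone costs more than the provider charges: never breaks even.
        return float("inf")
    return fixed_per_hour / margin


if __name__ == "__main__":
    # All inputs below are made-up placeholders; substitute your own figures.
    u = break_even_utilization(
        hardware_cost=10_000.0,
        lifetime_hours=3 * 365 * 24,
        power_cost_per_busy_hour=0.10,
        tokens_per_busy_hour=1_000_000,
        provider_price_per_million_tokens=1.0,
    )
    print(f"break-even utilization: {u:.0%}")
```

A result at or above 1.0 means the box can't match the provider even running around the clock; how close you land to that depends entirely on the numbers you plug in.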

Der_Einzige today at 2:28 PM

The ecosystem for inference is centralized around a few core projects, namely vLLM, SGLang, and llama.cpp.

If they decided to collude, they could absolutely say "from now on you no longer have access to model X because you're an asshole."

The commercial inference offerings are also downstream of one of those 3 projects (or TensorRT-LLM if they're NVIDIA). It would impact Ollama, Fireworks, Together, and everyone else.

Don't tempt fate.