Hacker News

joshstrange · today at 6:32 PM

LLMs and LLM providers are massive black boxes. I get a lot of value from them and so I can put up with that to a certain extent, but these new "products"/features that Anthropic are shipping are very unappealing to me. Not because I can't see a use-case for them, but because I have 0 trust in them:

- No trust that they won't nerf the tool/model behind the feature

- No trust they won't sunset the feature (the graveyard of LLM-features is vast and growing quickly while they throw stuff at the wall to see what sticks)

- No trust in the company long-term. Both in them being around at all and them not rug-pulling. I don't want to build on their "platform". I'll use their harness and their models but I don't want more lock-in than that.

If Anthropic goes "bad" I want to pick up and move to another harness and/or model with minimal fuss. Buying in to things like this would make that much harder.

I'm not going to build my business or my development flows on things I can't replicate myself. Also, I imagine debugging any of this would be maddening. The value add is just not there IMHO.

EDIT: Put another way, LLM companies are trying to climb the ladder to become a platform, and I have zero interest in that. I want a "dumb pipe", I want a commodity, I want a provider, not a platform. Claude Code is as far into the dragon's lair as I want to venture, and I'm only okay with that because I know I can jump to OpenCode/Codex/etc. if/when Anthropic "goes bad".


Replies

freedomben · today at 9:59 PM

This echoes my thoughts exactly. I've tried to stay model-agnostic but the nudges and shoves from Anthropic continue to make that a challenge. No way I'm going that deep into their "cloud" services, unless it's a portable standard. I did MCP and skills because those were transferable.

I also clearly see the lock-in/moat strategy playing out here, and I don't like it. It's classic SV tactics. I've been burned too many times to let it happen again if I can help it.

pc86 · today at 9:12 PM

> - No trust that they won't nerf the tool/model behind the feature

To the contrary, they've proven again and again and again they'll absolutely do that the first chance they get.

mikepurvis · today at 7:31 PM

> I want to pick up and move to another harness and/or model with minimal fuss. Buying in to things like this would make that much harder.

Yes, I expect that is very much the point here. A bunch of product guys got in front of a whiteboard and said: okay, the thing is in wide use, but the main moat is that our competitors are even more distrusted in the market than we are; other than that it's completely undifferentiated and can be swapped out in a heartbeat for multiple other offerings. How do we persuade our investors we have a locked-in customer base that won't just up stakes in favour of other options, or just run open source models themselves?

JohnMakin · today at 8:26 PM

This is similar to a sentiment I heard early on in the cloud-adoption fever: many companies hedged by going "multi-cloud", which ended up mostly being abandoned due to hostile patterns by cloud providers and a lot of cost. Ultimately it didn't really end up mattering, and the most dire predictions of vendor lock-in abuse didn't really happen as feared (I know people will disagree with this, but speaking specifically about AWS, the gap between the predictions and what actually happened is massive; note I have never and will never use Azure, so I could be wrong on that particular one).

I see people drawing similar conclusions about various LLM providers. I suspect in the end it'll shake out about the same way: the providers will end up practically non-interoperable with each other, whether due to inconvenience, cost, or whatever. So I've not wasted much of my time thinking about it.

palata · today at 7:08 PM

> - No trust that they won't nerf the tool/model behind the feature

I actually trust that they will.

jeppester · today at 10:00 PM

I always hated SEO because it was not an exact science - like programming was.

Too bad we've now managed to turn programming into the same annoying guesswork.

gbro3n · today at 8:47 PM

I have heard it said that tokens will become commodities. I like being able to switch between OpenAI's and Anthropic's models, but I feel I'd manage if one of them disappeared. I'd probably even get by with Gemini. I don't want to lock into any one provider any more than I want to lock into my energy provider. I might pay 2x for a better model, but no more, and I can see even that premium not lasting much longer.

ahmadyan · today at 8:15 PM

> I'm not going to build my business or my development flows on things I can't replicate myself.

But you can replicate these yourself! I'm happy that Anthropic/OpenAI are experimenting to find PMF for "LLMs for dev-tools". After they figure out the proper stickiness (or if they go away, nerf things, raise prices, etc.), you can always take the off-ramp and implement your own LLM/agent using the existing open-source models. The cost of building dev-tools is near zero; it's not like codegen, where you need frontier performance.

nine_k · today at 9:17 PM

In this regard, the release of open-weight Gemma models that can run on reasonable local hardware, and are not drastically worse than Anthropic flagships, is quite a punch. An M2 Mac Mini with 32GB costs about 10 months' worth of a Claude Max subscription.
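
(Rough back-of-the-envelope behind that comparison, with both numbers assumed rather than quoted: a ~$100/month Claude Max tier against roughly $1,000 of hardware.)

    # Back-of-the-envelope only; both prices are assumptions, not quotes.
    claude_max_per_month = 100      # USD/month, assumed subscription tier
    mac_mini_32gb = 1_000           # USD, assumed street price
    print(mac_mini_32gb / claude_max_per_month, "months to break even")   # -> 10.0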

chinathrow · today at 6:34 PM

Yeah, so better to convert tokens into software that does the job at close to zero cost, running on your own systems.

cush · today at 7:22 PM

You could so easily build your own /schedule. This is hardly a feature driving lock-in.
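
(For what it's worth, a minimal sketch of what "your own /schedule" could look like: a plain loop that shells out to an agent's headless mode. The `claude -p` invocation and the daily cadence are assumptions; any CLI agent with a non-interactive mode slots in the same way.)

    # Hypothetical stand-in for a /schedule feature: run fixed prompts through a
    # coding agent's headless CLI once a day. Swap the command for whatever harness you use.
    import subprocess
    import time

    TASKS = [
        ("dep-report", "Summarize outdated dependencies in this repo"),
    ]

    while True:
        for name, prompt in TASKS:
            result = subprocess.run(["claude", "-p", prompt],   # assumed headless flag
                                    capture_output=True, text=True)
            print(f"[{name}] exit={result.returncode}")
        time.sleep(24 * 60 * 60)    # once a day; cron would do this part better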

tiku · today at 8:02 PM

I believe it doesn't matter; other companies will copy or improve it. The same happened with clawdbot: the number of clones within a month was insane.

SV_BubbleTime · today at 9:56 PM

Without getting too pedantic for no reason… I think it’s important not to call this an LLM.

This isn’t an LLM. It’s a product powered by an LLM. You don’t get access to the model you get access to the product.

An LLM can’t do a web search, an LLM can’t convert Excel files into something and then into PDF. Products do that.
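
(A minimal sketch of that split, with call_llm and the tool table as hypothetical stand-ins rather than any vendor's API: the model only ever proposes a tool call; the product is the loop that executes it.)

    # The "product" layer around a model, sketched: the model proposes, the loop runs tools.
    def call_llm(messages):
        # Placeholder for whichever model endpoint you use; assume it returns either
        # plain text or a dict like {"tool": "web_search", "args": {"query": "..."}}.
        raise NotImplementedError

    TOOLS = {
        "web_search": lambda query: f"(search results for {query!r})",
    }

    def run_product(user_prompt):
        messages = [{"role": "user", "content": user_prompt}]
        while True:
            reply = call_llm(messages)
            if isinstance(reply, dict) and reply.get("tool") in TOOLS:
                # This step is what the bare model cannot do on its own.
                messages.append({"role": "tool",
                                 "content": TOOLS[reply["tool"]](**reply["args"])})
            else:
                return reply    # plain text: the model's final answer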

I think it’s a mistake to say "I don’t trust this engine to get me there" rather than "I don’t trust this car." Because for the most part the engine, despite giving you slightly different performance every time, is roughly doing the same thing over and over.

The product is the curious entity you have no control over.

wookmaster · today at 9:03 PM

They're trying to find ways to lock you in

sunnybeetroot · today at 7:45 PM

Isn’t that what LangChain/LangGraph is meant to solve? Write workflows/graphs and host them anywhere?
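
(Roughly, yes; a minimal sketch of the idea, assuming LangChain's init_chat_model helper and with the model names as placeholders: the provider/model pair is the only thing that changes when you move vendors.)

    # Provider-agnostic call via LangChain; swap provider/model to move vendors.
    from langchain.chat_models import init_chat_model

    def review_diff(diff, provider="anthropic", model="claude-3-5-sonnet-latest"):
        llm = init_chat_model(model, model_provider=provider)
        return llm.invoke(f"Review this diff and list the risks:\n{diff}").content

    # e.g. review_diff(diff, provider="openai", model="gpt-4o")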

alfalfasprout · today at 10:06 PM

Yep. Trust is easy to lose, hard to earn. A nondeterministic black box that is likely buggy, will almost certainly change, and has a likelihood of getting enshittified is not a very good value proposition to build on top of or invest in.

Increasingly, we're also seeing the moat shrink somewhat. Frontier models are converging in performance (and I bet even Mythos will get matched) and harnesses are improving too across the board (OpenCode and Codex for example).

I get why they're trying to do that (a perception of a moat bloats the IPO price) but I have little faith there's any real moat at all (especially as competitors are still flush with cash).

slopinthebag · today at 8:48 PM

They have to become a platform because that is their only hope of locking in customers before the open models catch up enough to eat their lunch. Stuff like Gemma is already good enough to replace ChatGPT for the average consumer, and stuff like GLM 5.1 is not too far off from replacing Claude/Codex for the average developer.

verdverm · today at 6:37 PM

I fully endorse building a custom stack (1) because you will learn a lot and (2) for full control, and not having Big AI define our UX/DX for this technology. Let's learn from history this time around?


crystal_revenge · today at 9:03 PM

This sounds like someone complaining about how Windows is a black box while ignoring the existence of Linux/BSD.

I'm currently hosting, on very reasonable consumer-grade hardware, an LLM that is on par, performance-wise, with what anyone was paying for about a year ago. Including all the layers in between the model and the user.

Llama.cpp serves up Gemma-4-26B-A4B; Open WebUI handles the client details: system prompt, web search, image gen, file uploading, etc. Conduit and Tailscale provide the last layer so I can have a mobile experience as robust as anything I get from Anthropic. Plus I know how all the pieces work and can upgrade, enhance, etc. to my heart's delight. All this runs from a pretty standard MBP at > 70 tokens/sec.
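
(For anyone curious what "handles the client details" means in practice: llama.cpp's llama-server speaks an OpenAI-compatible API, so any client, Open WebUI or a few lines of your own, can sit in front of it. Host, port, and model name below are assumptions about a local setup.)

    # Talking to a local llama-server over its OpenAI-compatible endpoint.
    import requests

    resp = requests.post(
        "http://localhost:8080/v1/chat/completions",    # assumed local llama-server address
        json={
            "model": "gemma",   # largely informational; the loaded GGUF is what answers
            "messages": [{"role": "user", "content": "Explain what a context window is."}],
            "temperature": 0.7,
        },
        timeout=120,
    )
    print(resp.json()["choices"][0]["message"]["content"])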

If you want to better understand the agent side of things, look into the Hermes agent and you can start understanding the internals of how all this stuff is done. You can run a very competitive coding agent using modest hardware and open models. On a similar note, image/video gen on local hardware has come a long way.

Just like with Linux, you're going to be exchanging time for this level of control, but it's something anyone who takes LLMs seriously and has the same concerns can easily get started with.

Yet I still see comments like this that seem to completely ignore the incredible work in the open-model community, which has been perpetually improving and is starting to be really competitive. If you relax the "local" requirement and just want more performance from an LLM backend, you can replace the llama.cpp part with a call to Kimi 2.5 or Minimax 2.7 (which you could feasibly run at home; not Kimi, though). You can still control all the additional parts of the experience but run models that are very competitive with current proprietary SoTA offerings, still 100% under your control and at a fraction of the price.