> Then that is also on me for using a tool that I can't control.
That's a core trait of LLMs.
Even the AI companies building frontier models have felt the need to put together whole test suites specifically designed to evaluate a model's propensity to subvert the user's intentions.
https://www.anthropic.com/research/shade-arena-sabotage-moni...
> Giving up control is a decision.
No, it is definitely not. Only recently did frontier models start resorting to generating ad-hoc scripts as makeshift tools. They even generate scripts to apply changes to source files.
You seem to misunderstand me. An LLM can only spit out text. It is the tooling I use that lets it write scripts and call them. In my tooling it has to wait for me to approve any change, script invocation, or other tool call that might modify something. I can make that deterministic: I know it will stop and ask because it has no choice. If I want to be even safer, I give it no tools at all.
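To make that concrete, here is a minimal sketch of that kind of approval gate in Python. The `reply` shape, the `run_shell` tool, and the function names are all made up for illustration and don't correspond to any particular agent framework's API:

    import subprocess

    def ask_user(prompt: str) -> bool:
        # Explicit, interactive gate: nothing runs without a literal "y".
        return input(prompt + " [y/N] ").strip().lower() == "y"

    def run_tool(name: str, args: dict) -> str:
        # The only tool wired up here; the model cannot add more on its own.
        if name == "run_shell":
            out = subprocess.run(args["command"], shell=True,
                                 capture_output=True, text=True)
            return out.stdout + out.stderr
        return f"unknown tool: {name}"

    def handle_model_reply(reply: dict) -> str | None:
        # The model's reply is only data; nothing in it executes by itself.
        call = reply.get("tool_call")
        if call is None:
            return None  # plain text answer, no side effects possible
        # Deterministic stop: execution happens only after explicit approval.
        if not ask_user(f"Model wants to run {call['name']}({call['args']})"):
            return "User declined the tool call."
        return run_tool(call["name"], call["args"])

The point is that the gate lives in code I control: the model can ask for whatever it likes, but the input() prompt is the only path to execution.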
I can also just choose not to use an LLM at all. It is my choice to use them, so it is my duty to keep myself safe. If I couldn't control that, I'd be stupid to use them.
My take is that I can probably use LLMs safely as long as I don't let them run autonomously. There is a slight chance that the LLM will generate a string that triggers a bug in an MCP server and lets the model do what it wants. That is a risk I am willing to take, and I will take the blame if it goes wrong.