Hacker News

BadBadJellyBean today at 3:13 PM

My team and I are firm that we are the ones accountable. LLMs are a tool like every other, only non-deterministic. But I am the one using the tool. I am the one giving the tool access. I am the one who has to keep everything safe.

I have shot myself in the foot using gparted in the past by wiping the wrong disk. gparted wasn't to blame. I was.

Letting LLMs work freely without supervision sounds great, but it will lead to pain. I have to supervise their work, and that includes during execution. You can try to replace a human, but we see where that leads. Sooner or later the LLM will do something stupid, and then the only one to blame is the person who used the tool.


Replies

pjc50 today at 3:58 PM

This is kind of the reverse of https://en.wikipedia.org/wiki/Poka-yoke . A lot of tools have affordances built in to make "right" things easy and "wrong" or unsafe things harder. LLMs... well, the text interface is uniquely flat. Everything is seemingly as easy as everything else.

I worry about the use of humans as sacrificial accountability sinks. The "self-driving car" model already has this: a car which drives itself most of the time, but where a human user is required to be constantly alert so that the AI can transfer responsibility a few hundred milliseconds before the crash.

bombcar today at 4:32 PM

> gparted wasn't to blame. I was.

These can both be true, especially if/when the tool has bad defaults. This is why you have things like "type the name of the database you're dropping" safety features - but you also have to name your production database something like "THE REAL DaTabaSe - FIRE ME" so you have to type that and don't fall into the trap of ending up with the same name in test/development.
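That confirmation pattern is simple to sketch. A minimal illustration (the function and names here are hypothetical, not any particular tool's implementation):

```python
# Hypothetical sketch of a "retype the name to confirm" guard:
# a destructive action only proceeds when the user types back
# the exact name of the thing being destroyed.

def drop_database(db_name: str, typed_confirmation: str) -> str:
    """Refuse the drop unless the confirmation matches exactly."""
    if typed_confirmation != db_name:
        return f"aborted: {typed_confirmation!r} does not match {db_name!r}"
    # Destructive work would happen here.
    return f"dropped {db_name}"
```

The point of the pattern is that the "wrong" action requires deliberate, unambiguous effort - exactly the poka-yoke affordance mentioned above.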

AI is particularly seductive because it sounds like a reasonable person has thought things out, but it's all just a giant confidence trick (that works most of the time, which makes it even more dangerous).

kokojambo today at 4:07 PM

This is the right approach. I've been developing for 30 years and very much enjoy working with AI. It's easy to see that the AI is only as good as the person using it. Deterministic or not, it's up to the dev to check the result (both code and behavior). I compare the anti-AI articles, like the one saying "AI deleted my prod DB", to factory workers rioting and complaining about machines replacing them. AI makes a good developer better. The tech industry has always attracted fakers who wanted a piece of the pie, and now that these people have their hands on a powerful tool and connect it to their prod DB, they cry in pain and frustration. Like people with no license crashing a car and crying that cars are dangerous: they are, but only because people use them dangerously.

fyrabanks today at 3:40 PM

Thank you. Exactly this.

There were so many fundamental problems with the infrastructure even before the person gave a poor prompt to an agent.

If you're using the same API key for staging and prod--and just storing it somewhere at random, then forgetting about it--you're setting yourself up for failure with or without AI.

lelanthran today at 4:41 PM

> I have shot myself in the foot using gparted in the past by wiping the wrong disk. gparted wasn't to blame. I was.

Much like how a poor workman always blames his tools, people using poor tools always blame themselves.

I mean, Donald E. Norman wrote The Psychology of Everyday Things in the 80s! (It later became "The Design of Everyday Things".)

And yet, today, we still have a bunch of people defending Gnome's design decisions, or the latest design decisions from Apple, etc.

locknitpicker today at 3:38 PM

> My team and I are firm that we are the ones accountable. LLMs are a tool like every other.

Except they definitely are not.

LLMs alone are highly non-deterministic, even at a high level, where they can pursue goals contrary to the user's prompts. Then, when introduced into ReAct-type loops and granted capabilities such as the ability to call tools, they are able to modify anything and perform all sorts of unexpected actions.

To make matters worse, nowadays models not only have the ability to call tools but also to generate on the fly whatever ad-hoc script they want to run, which means that their capabilities are not limited to the software you have installed on your system.

This goes way beyond "regular tool" territory.
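The tool-calling loop described above can be gated, at least partially, with an explicit allowlist. A minimal sketch (all names and the stub tool are hypothetical, not any framework's actual API):

```python
# Hypothetical sketch of the dispatch step in a ReAct-style agent loop:
# the model requests an action by name, and anything not explicitly
# allowlisted is refused rather than executed.

from typing import Callable

# Only tools registered here can ever run, no matter what the model asks for.
ALLOWED_TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": lambda path: f"<contents of {path}>",  # stub for illustration
}

def run_step(tool_name: str, arg: str) -> str:
    """Dispatch one model-requested action, refusing unknown tools."""
    tool = ALLOWED_TOOLS.get(tool_name)
    if tool is None:
        return f"refused: {tool_name} is not an allowed tool"
    return tool(arg)
```

Note this only gates named tools; as the comment points out, a model that can generate and run arbitrary scripts routes around this kind of allowlist entirely.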

mystraline today at 4:18 PM

> LLMs are a tool like every other. Only that it's non deterministic.

If you stay away from the corporate SaaS token vendors and run your own, you will find LLMs can be deterministic, based purely on the exact input. As long as the context window's tokens are the same (and the sampling settings are fixed), you will get the same output.
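A toy illustration of that claim, with a stub lookup table standing in for the model: under greedy (argmax) decoding, an identical context always yields an identical output.

```python
# Toy model: a fixed table mapping a context (tuple of tokens) to
# next-token logits. A real model is a neural net, but the property
# illustrated is the same: fixed weights + fixed context + greedy
# decoding => fixed output.

NEXT_TOKEN_LOGITS = {
    ("the",): {"cat": 2.0, "dog": 1.5},
    ("the", "cat"): {"sat": 3.0, "ran": 1.0},
}

def greedy_decode(context: tuple, steps: int) -> tuple:
    """Always pick the highest-scoring next token (no sampling)."""
    out = context
    for _ in range(steps):
        logits = NEXT_TOKEN_LOGITS.get(out)
        if not logits:
            break
        out = out + (max(logits, key=logits.get),)
    return out
```

Non-determinism enters when a temperature above zero samples from the distribution instead of taking the argmax, or when the hidden context differs between calls, which is the vendor behavior described below.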

The corporate vendors do tricks: they swap models and play with context inherited from other chats. It makes one-shot questions annoying because unrelated chats will creep into your context window.
