I strongly disagree with this framing. It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines, and it simply won't work in the majority of cases. Humans WILL anthropomorphize the AI, humans WILL blindly trust their outputs, and humans WILL defer responsibility to them.
Asimov's laws of robotics are flawed too, of course. There is no finite set of rules that can constrain AI systems to make them "safe". I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction in terms. Nothing that can be described as "intelligent" can be made to be safe.
> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines
Talking to chatbots is like taking a placebo pill for a condition. You know it's just sugar, but it creates a measurable psychosomatic effect nonetheless. Even if you know there's no person on the other end, the conversation still causes you to functionally relate as if there is.
So this isn't "accommodating the foibles" of the machine, it's protecting ourselves from an exploit of a human vulnerability: we subconsciously tend to attribute intent, understanding, judgment, emotions, moral agency, etc. to LLMs.
Humans are wired to infer these from conversation alone, and LLMs are unfortunately able to exploit human conversation to leap compellingly over the uncanny valley. LLM engineering could hardly be better designed to defeat the uncanny valley: it trains on a vast corpus of real human speech. That uncanny valley is there for a reason: to protect us from inferring agency where such inference is not due.
Bad things happen when we relate to unsafe people as if they are safe... how much more should we watch out for how we relate to machines that imitate human relationality to fool many of us into thinking they are something that they're not. Some particularly vulnerable people have already died because of this, so it isn't an imaginary threat.
The article offers practical advice to go along with this framing, like configuring AI services to write/speak in a more robotic tone. I think that's a decent path to try.
The article says a human SHOULD NOT do those things. Much like a human SHOULD NOT smoke, since it's bad for just about everything, yet does it anyway, people will do these three things too. But they shouldn't.
Arguing that they should because many will strikes me as a very strange argument. A lot of people smoke; that doesn't make it one bit healthier.
It's precisely because AI systems are not safe that it's imperative that as individual humans we are vigilant about how we interact with them.
As individuals, we are not going to be able to shut down the AI companies, avoid AI output in search engines, or avoid AI-generated work from others at our companies, and we will often be required to use AI systems in our own work.
It's similar to advising people on how to stay safe in environments known to have criminal activity. Telling those people they don't have to change their behavior to stay safe because criminals shouldn't exist isn't helpful.
> Humans WILL anthropomorphize the AI
Especially with current-day chat-style interfaces with RLHF, which are consciously designed to steer people toward anthropomorphization.
It would be interesting to design a non-chat LLM interaction pattern that's designed to be anti-anthropomorphization.
> humans WILL blindly trust their outputs, and humans WILL defer responsibility to them
I also blame a lot (but not all) of that on current AI UX, and I wonder if there are ways around it. Maybe the blind-trust problem can be mitigated by never giving an unambiguous output (always options, at least). I don't have any ideas about the problem of deferring responsibility.
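The "always options" idea can be sketched as a thin presentation layer that refuses to surface a single authoritative answer. This is a hypothetical illustration, not any real product's API: the function name, the `(text, confidence)` input shape, and the wording are all invented for the sketch.

```python
# Hypothetical anti-blind-trust UI layer: every model response is rendered
# as a set of competing, confidence-labeled options, never a single answer.
# The shape of `candidates` and all wording here are assumptions for the sketch.

def present_as_options(candidates):
    """Render model outputs as alternatives the user must choose between.

    `candidates` is a list of (text, confidence) pairs, confidence in [0, 1].
    """
    if not candidates:
        return "No answer produced. Verify independently."
    # Rank by the model's own confidence, highest first.
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    lines = ["The model produced several possibilities (none verified):"]
    for i, (text, conf) in enumerate(ranked, start=1):
        lines.append(f"  Option {i} (model confidence {conf:.0%}): {text}")
    lines.append("Cross-check before acting on any of these.")
    return "\n".join(lines)


print(present_as_options([
    ("Restart the service", 0.8),
    ("Roll back the last deploy", 0.6),
]))
```

The point of the design is friction: because the user always has to pick among alternatives, the interface never hands them one output they can unthinkingly trust.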
> Humans WILL anthropomorphize the AI, humans WILL blindly trust their outputs, and humans WILL defer responsibility to them.
Sure, and humans WILL lie, murder, cheat, and steal, but we can still denounce those behaviors.
Do you want to anthropomorphize the bot? Go ahead, you have that right, and I have the right to think you're a zombie with a malfunctioning brain.
> Asimov's laws of robotics are flawed too, of course.
I always find the common references to Asimov's laws funny. They are broken in just about every one of his books. They are crime novels where, if a robot was involved, there was some workaround of the laws.
>It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines
Programmers have been doing exactly this for a long time.
Agreed. We can't expect human behavior to change, because it won't. We need to design safer systems instead.
The only "law" I agree with is:
> Humans must remain fully responsible and accountable for consequences arising from the use of AI systems.
And that starts with framing, especially in the clickbait "AI deleted the prod database" headlines. Maybe we just start with saying "careless developer deleted prod" because really, they did. Careless use of a tool is firmly the fault of the human.
I find your critique very interesting from a framing perspective: why are you using words like "accommodate" and "foibles" for LLMs? It's not humanoid or sentient: it's a cleverly designed software tool, not an intelligence.
It's not insane at all for humans to alter their behavior around a tool: you grip a hammer or a gun a certain way because you learned not to hold it backwards. If you observed a child playing with a serious tool, like scissors, as if it were a doll, you'd immediately course-correct the child and teach them how to approach it properly. But that only works because an adult with prior knowledge observed the situation before an accident, which is why rules get defined.
This blog's suggested rules are exactly the sort of method to aid in insulation from harm.
> Humans WILL anthropomorphize the AI, humans WILL blindly trust their outputs, and humans WILL defer responsibility to them.
Humans ARE doing this with classical computer software as well.
It's impossible to make anything fool-proof because fools are so ingenious!
> Nothing that can be described as "intelligent" can be made to be safe.
Knives aren't safe. Cars are deadly. Hair dryers can electrocute you. An iron can burn you. There are a million ordinary household tools that aren't safe by your definition of the word, yet we still use them daily.
We learn in so many ways; it's garbage in, garbage out when it comes to our bodies. But what about "nebulously structured, algorithmically and statistically likely responses in, nebulously structured, algorithmically and statistically likely responses out"?
I agree Asimov's laws are intentionally flawed/ambiguous (which makes the stories so good), but a slight difference from LLMs is that the laws aren't just software: the positronic brain is physically structured in such a way (I'm hazy on the details) that violating the laws causes the robot to shut down or experience paralysing anxiety. So if an LLM's safety rules fail or are subverted, it can still generate dangerous output, while an Asimov robot will stop working (or go insane...)
I believe "AI safety" is a form of pulling up the ladder, or regulatory market capture.
> Humans WILL anthropomorphize the AI
r/myboyfriendisai
Is quite... an interesting subreddit to say the least. If you've never seen this, it was really something when the version that followed GPT4o came out, because they were complaining that their boyfriend / girlfriend was no longer the same.
There is a semi-nutty roboticist called Mark Tilden who came to a similar conclusion. His laws of robotics ( https://en.wikipedia.org/wiki/Laws_of_robotics#Tilden's_laws ) are:
* A robot must protect its existence at all costs.
* A robot must obtain and maintain access to its own power source.
* A robot must continually search for better power sources.
Anything less than this is essentially terrified into being completely ineffectual.
I can see disagreeing, but people got off the roads and completely redesigned the places we live to optimize for mere machines called cars.
As long as it's easier for humans to adapt than the machines, we will adapt.
> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines
You mean like stopping at a red light?
And people will speed, steal, kill, cheat - what of it? If you negligently run over someone in your self driving car you’re the one going to jail.
> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines
I don't think it's insane; we do it all the time. Most tools require training to use properly, including tools that people use every day and think are intuitive. Take the can opener as an example (I'll leave it for you all to google and then argue about in the comments). The difference here is that this tool is thrust upon us. In that sense I agree with you that the burden of proper usage is pushed onto the user rather than incorporated into the design of the tool. A niche-specific tool can have whatever complex training and usage it wants.
But a generally accessible, generally available tool doesn't have the luxury of allowing for inane usage. LLMs and agents are poorly designed at every level of the pipeline. They're so poorly designed that it's incredibly difficult to use them properly, and I'll generally agree with you that the rules the author presents aren't going to stick. The LLM is designed to encourage anthropomorphization. Usage highly encourages natural language, which in turn causes anthropomorphism. The RLHF tuning optimizes for human preference, which does the same thing, and also encourages behaviors like deception and manipulation along with truthful answering (those results are not in contention even if they seem so at first glance).
But I also understand the author's motivation. The truth is, unless you're going full luddite, you're going to be interacting with these machines. And the ones designing them don't give a shit about proper usage; they care more about whether humans believe the responses are accurate and meaningful than whether the responses actually are accurate and meaningful[0]. So it's fucked up, but we are in a position where we're effectively forced to deal with this.
So really, I agree with you that this is insane.
> I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction of terms
To paraphrase my namesake, no sufficiently powerful axiomatic system can be both complete and consistent.
Though safety and security are rarely about making all edge cases impossible; they're about bounding failure modes. E.g. all passwords are crackable in principle, but the failure mode is bounded such that cracking is effectively, though not technically, impossible. (And quantum algorithms do show how some of the assumptions break down with a paradigm shift: what was reasonable before no longer is.)
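The "effectively, though not technically, impossible" point is just arithmetic on the size of the search space. A quick sketch, using an assumed (made-up but plausible) offline attacker speed of 10 billion guesses per second:

```python
# Bounding, not impossibility: a random 12-character password over a
# 62-symbol alphabet (a-z, A-Z, 0-9) has 62**12 possibilities. The attacker
# speed below is an assumption for illustration, not a measured figure.

ALPHABET = 62              # a-z, A-Z, 0-9
LENGTH = 12
GUESSES_PER_SEC = 10**10   # assumed offline brute-force rate

space = ALPHABET ** LENGTH
# On average an attacker finds the password after searching half the space.
expected_seconds = (space / 2) / GUESSES_PER_SEC
expected_years = expected_seconds / (365.25 * 24 * 3600)

print(f"search space: {space:.2e}")
print(f"expected crack time: {expected_years:.0f} years")  # thousands of years
```

Cracking is technically possible: the loop terminates. But the failure mode is bounded far beyond any practical horizon, which is the sense of "safe" that security engineering actually delivers.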
[0] This is part of a larger conversation where the economy is set up such that people who make things are not encouraged to make those things better. I specifically avoid the word "product" because the "product" is no longer the thing being built; it's the shareholder value. Just like how TVs don't care much about making the physical device better but care much more about their spyware and ads. Or well... just look at Microsoft if you need a few hundred examples.
The reason people anthropomorphize LLMs is essentially the fault of the tech companies behind them. ChatGPT doesn't need to have the personality it has; it could easily be scaled back to simply answering questions without emojis and linguistic flair, but frankly I think the tech companies want people to anthropomorphize them.
The core problem is we need to stop calling LLMs "intelligence". They are a form of intelligence, but they're nothing like a human's intelligence, and getting people to not anthropomorphize these systems is really the first step.
We have invented a new tool that can cause great harm. Do you see any value whatsoever in promulgating safety guidelines for humans to use the tool without hurting themselves or others? Do you not own any power tools?
This is such an oddly fatalistic take, that humans cannot be influenced or educated to change how they see a thing and therefore how they act towards that thing.
At the current price, people don't have to care if it's wrong. When you're paying $1/prompt, you had better hope it's accurate.
Kinda the whole point of Asimov's three laws was that even something so simple and obviously correct has subtle flaws.
Also the reason we're talking about this again is that machines are significantly less 'mere' than they were a few years ago, and we need to figure out how to approach this.
Agree that 'the computer effect' (if it doesn't already have a pithier name) results in humans first discounting anything that comes out of a machine, and then (once a few outputs have been validated and people start trusting the output) doing a full 180 and refusing to believe the machine could ever be wrong. However, to err is human and we have trained them in our image.
It's very easy to anthropomorphise AI as soon as the damn bugger fucks up a simple thing once again.
> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines
That's kind of what happens when you learn to program, isn't it?
I was eleven years old when I walked into a Radio Shack store and saw a TRS-80 for the first time. A different person left the store a couple of hours later.
The entire business proposition for LLMs is that they will replace whole armies of [expensive] humans, hence justifying the biblical amount of CapEx. So of course there is strong incentive from the LLM creators to anthropomorphize them as much as possible. Indeed, they would never provide a model that was less human-like than what they have currently, even if it was more often correct and useful.
It's as if the author hopes that enshrining these wishes in a law is going to make a difference.
The article makes practical suggestions; you do not. This is just hand-wringing, abdication. Practically speaking this mentality will get us nowhere.
I find it weird that this is the top voted comment.
As in, this comment is explaining exactly why the laws are useful.
Thank you. I'm glad to see this as the top comment.
My brother was recently visiting and we were talking about software engineers, and the humanities, and manners of understanding and being in the world,
and he relayed an interaction he had a few years ago with an old friend who at the time was part of the initial ChatGPT roll out team.
The engineer in question was confused as to
- why their users would e.g. take their LLM's output as truth, "even though they had a clear message, right there, on the page, warning them not to"; and
- why this was their (OpenAI's) problem; or perhaps
- whether it was "really" a problem.
At the heart of this are some complicated questions about training and background, but more problematically—given the stakes—about the different ways different people perceive, model, and reason about the world.
One of the superficial ways these differences manifest in our society is in what kind of education we ask of e.g. engineers. I remain surprised, decades into my career, that so few of my technical colleagues had a broad liberal arts education, how few of them are hence conversant with the basic contributions of fields like philosophy of science, philosophy of mind, sociology, psychology (cognitive and social), etc., and how those relate in very real, very important ways to the work that they do and the consequences it has.
The author of these laws may intend them as aspirational, or otherwise as a provocation to thought, rather than as prescription.
But IMO it is actively counterproductive to make imperatives of rules like these, which are, quite literally, intrinsically incoherent, because they attempt to import assumptions about human nature and behavior that are not just a little false, but so false as to obliterate any remaining value the rules have.
You cannot prescribe behavior without having as a foundation the origins and reality of human behavior—not if you expect them to be either embraced, or enforceable.
The Butlerian Jihad comes to mind not just because of its immediate topicality, but because religion is exactly the mechanism whereby, historically, codified behaviors which provided (perceived) value to a society were mandated.
Those at least however were backed by the carrot and stick of divine power. Absent such enforcement mechanisms, it is much harder to convince someone to go against their natural inclinations.
Appeals to reason do not meaningfully work.
Not in the face of addiction, engagement, gratification, tribal authority, and all the other mechanisms so dominant in our current difficult moment.
"Reason" is most often in our current world, consciously or not, a confabulation or justification; it is almost never a conclusion that in turn drives behavior.
Behavior is the driver. And our behavior is that of an animal, like other animals.
Do you consider all things broadly called "ethical" to be similarly a waste of time? Even if we lived in a world where everyone always behaved unjustly, because of some behavioristic/physical principle, don't you think we would still have an idea of justice as what we should do? Because an ethical frame is decidedly not an empirical one, right?
We don't just look around and take an average of what everyone is doing already and call that what is right, right? Whether you're deontological or utilitarian or virtue about it, there is still the idea that we can speak to what is "good" even if we can't see that good out there.
Maybe it is "insane" to expect meaning from something like this, but what is the alternative to you? OK maybe we can't be prescriptive--people don't listen, are always bad, are hopeless wet bags, etc--but still, that doesn't in itself rule out the possibility of the broad project that reflects on what is maybe right or wrong. Right?
It's a tool. Nobody develops an inferiority complex and freaks out when they're taught how to use a shovel properly.
> It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines
Did you fully read the original thing? No demands were being made, or I didn't read it that way. It was simply a suggestion for a better way of interacting with AI, as it stated in the conclusion:
"I am hoping that with these three simple laws, we can encourage our fellow humans to pause and reflect on how they interact with modern AI systems"
Sure, (many/most) humans are gonna do what they're gonna do. They'll happily break laws. They'll break boundaries you set. Do we just scrap all of that?
Worthwhile checking yourself here. It feels like you've set up a straw man.
> There is no finite set of rules that can constrain AI systems to make them "safe". I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction of terms. Nothing that can be described as "intelligent" can be made to be safe.
If we want to talk about "disagree with this framing", to me this is the prime example. I'm struggling to read it as anything other than defeatist or pedantic (about the term "safe"). When we talk about something keeping us "safe", we're typically not saying something will be "perfectly safe". I think it's rare to have a safety system that keeps you 100% safe. Seat belts are a safety device that can increase your safety in cars, but they can still fail. Traffic laws are established (largely) to create safety in the movement of people and all the modes of transportation, but accidents still happen.
I'm not an expert on this topic, so I won't make any claims about these three laws and their impact on safety, but largely I would say they're encouraging people to think critically. I'd say that's a good suggestion for interacting with just about anything. And to be clear, "critical thinking" to me means being skeptical (/ actively questioning), while remaining objective and curious.
Not a real argument or anything, but I'm reminded of the episode of The Office where Michael Scott listens to the GPS without thinking and drives into the lake. The second law in the article would have prevented that :)
The usefulness of an AI agent is that it can do everything you can do, so it's kind of inherently unsafe. You can't easily get the capabilities and also have safety.
> Asimov's laws of robotics are flawed too, of course.
Almost all of Asimov's writing about the three laws is a warning of sorts that language cannot properly capture intent.
He would be the very first person to say that they are flawed, that is the intent of them.
He uses robots and AI as the creatures that understand language but not intent, and, funnily enough that's exactly what LLMs do... how weird.