godelski • today at 12:39 AM • 0 replies • view on HN

  > It's patently insane to demand that humans alter their behavior to accommodate the foibles of mere machines

I don't think it's insane; we do it all the time. Most tools require training to use properly, including tools that people use every day and think are intuitive. Take the can opener as an example (I'll leave it for you all to google and then argue in the comments).

The difference here is that this tool is thrust upon us. In that sense I agree with you: the burden of proper usage is pushed onto the user rather than incorporated into the design of the tool. A niche, specialized tool can have whatever complex training and usage it wants.

But a general-access, generally available tool doesn't have the luxury of allowing for inane usage. LLMs and agents are poorly designed, at every level of the pipeline. They're so poorly designed that it's incredibly difficult to use them properly, and I'll generally agree with you that the rules the author presents aren't going to stick. The LLM is designed to encourage anthropomorphization. Usage highly encourages natural language, which in turn causes anthropomorphism. The RLHF tuning optimizes for human preference, which does the same thing, and it also encourages behaviors like deception and manipulation alongside truthful answering (those results are not in conflict, even if they seem so at first glance).

But I also understand the author's motivation. Truth is, unless you're going full Luddite, you're going to be interacting with these machines. Truth is, the ones designing them don't give a shit about proper usage; they care more about whether humans believe the responses are accurate and meaningful than about whether the responses are accurate and meaningful[0]. So it's fucked up, but we are in a position where we're effectively forced to deal with this.

So really, I agree with you that this is insane.

> I don't have a proof, but I believe that "AI safety" is inherently impossible, a contradiction of terms

To paraphrase my namesake, no consistent axiomatic system powerful enough to express arithmetic can be complete or prove its own consistency.

Though safety and security are rarely about making every edge case impossible; they're about bounding failure. E.g. all passwords are crackable, but the failure mode is bounded such that cracking one is effectively impossible, just not technically impossible. (And quantum algorithms do show how some of those assumptions break down with a paradigm shift. What was reasonable before no longer is.)
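
To put rough numbers on the password point, here's a back-of-the-envelope sketch. The function names and the guess rate are my own assumptions for illustration, not a measurement of any real cracking rig, and it only models blind brute force over uniformly random passwords.

```python
import math

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def keyspace_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random password."""
    return length * math.log2(alphabet_size)


def expected_years_to_crack(alphabet_size: int, length: int,
                            guesses_per_second: float) -> float:
    """Expected brute-force time: on average half the keyspace is searched.

    guesses_per_second is an assumed attacker budget, not a real benchmark.
    """
    keyspace = alphabet_size ** length
    expected_guesses = keyspace / 2
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    assumed_rate = 1e12  # hypothetical: one trillion guesses per second
    for alphabet, length in [(26, 8), (95, 16)]:
        bits = keyspace_bits(alphabet, length)
        years = expected_years_to_crack(alphabet, length, assumed_rate)
        print(f"alphabet={alphabet} length={length}: "
              f"{bits:.1f} bits, ~{years:.3g} years")
```

Both passwords are "hackable" in the technical sense. The difference is only where the bound sits: the short one falls in well under a second at that assumed rate, the long one takes far longer than anyone will wait. Change the rate assumption (new hardware, new algorithms) and the bound moves, which is the paradigm-shift caveat.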

[0] This is part of a larger conversation where the economy is set up such that people who make things are not encouraged to make those things better. I'm specifically avoiding the word "product" because the "product" is no longer the thing being built, it's the shareholder value. Just like how TV makers don't care much about making the physical device better but care much more about their spyware and ads. Or well... just look at Microsoft if you need a few hundred examples.
