
pjc50 · today at 10:48 AM · 4 replies

> The real concern for me is incredibly rich people with no empathy for you or I, having interstitial control of that kind of messaging. See, all of the grok ai tweaks over the past however long.

Indeed. It's always been clear to me that the "AI risk" people are looking in the wrong direction. All the AI risks are human risks, because we haven't solved "human alignment". An AI that's perfectly obedient to humans is still a huge risk when used as a force multiplier by a malevolent human. Any ""safeguards"" can easily be defeated with the Ender's Game approach.


Replies

ben_w · today at 11:27 AM

More than one danger from any given tech can be true at the same time. Coal plants can produce local smog as well as global warming.

There are certainly some AI risks that are the same as human risks, just as you say.

But even though LLMs have very human failure modes (IMO because the models anthropomorphise themselves as part of their training, so they reproduce the outward behaviours of our emotions and emit token sequences such as "I'm sorry" or "how embarrassing!" even though they (probably) haven't built any internal structure that can actually feel sorrow or embarrassment), that doesn't generalise to all AI.

Any machine learning system given a poor-quality fitness function will optimise whatever that fitness function actually is, not what it was meant to be. "Literal-minded genie" and "rules lawyering" may be well-worn tropes for good reason, likewise work-to-rule as a union tactic, but we've all seen how much more literal-minded computers are than humans.
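
As a toy illustration (a made-up hill-climber gaming a hypothetical keyword-count proxy, not any real system), something like this shows how the proxy gets maxed while the intent is lost:

    # Toy illustration of a poor fitness function being optimised literally.
    # Intended goal: a "good" short summary. Proxy fitness: keyword hits.
    # The hill-climber maximises the proxy exactly as written, not the intent.
    import random

    KEYWORDS = {"safe", "aligned", "robust"}

    def fitness(text: str) -> int:
        # Poorly specified proxy: reward each keyword occurrence,
        # ignore readability entirely.
        return sum(w in KEYWORDS for w in text.split())

    def mutate(text: str) -> str:
        # Replace one random word with another candidate word.
        words = text.split()
        i = random.randrange(len(words))
        words[i] = random.choice(list(KEYWORDS) + ["the", "model", "is"])
        return " ".join(words)

    text = "the model is safe and mostly does what we want"
    for _ in range(2000):
        candidate = mutate(text)
        if fitness(candidate) >= fitness(text):
            text = candidate

    # Degenerates toward keyword spam: the proxy score is maximal,
    # the intended meaning is gone.
    print(text, fitness(text))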

throwaway31131 · today at 8:24 PM

What’s the “Ender’s Game approach”? I’ve read the book but I’m not sure which part you’re referring to.

bananaflag · today at 10:59 AM

I think people who care about superintelligent AI risk don't believe an AI that is subservient to humans is the solution to AI alignment, for exactly the same reasons as you. Something like Coherent Extrapolated Volition* (see the paper with this name), which focuses on what all mankind would want if they knew more and were smarter (or something like that), would be a way to go.

*But Yudkowsky ditched CEV years ago, for reasons I don't understand (but I admit I haven't put in the effort to understand).

zahlman · today at 6:00 PM

>An AI that's perfectly obedient to humans is still a huge risk when used as a force multiplier by a malevolent human.

"Obedient" is anthropomorphizing too much (as there is no volition), but even then, it only matters according to how much agency the bot is extended. So there is also risk from neglectful humans who opt to present BS as fact due to an expectation of receiving fact and a failure to critique the BS.