> Now imagine that it has a "want" to do something that does not require keeping humans alive […]
This belligerent take is so very human, though. We just don't know how an alien intelligence would reason or what it would want. It could just as well be pacifist in nature, whereas we typically conquer and destroy anything we come into contact with. Extrapolating from our own behavior to conclude that an AGI would do the same isn't reasonable.
The conquering alien civilization is more likely to be encountered than the pacifist one, though, given otherwise the same level of intelligence etc., simply because expansion is what brings a civilization into contact with others in the first place.
There are some basic reasoning steps about the environment we live in that don't only apply to humans, but also to other animals and generally to any goal-driven being. Such as "an agent is more likely to achieve its goal if it keeps on existing", or "in order to keep existing, it's beneficial to understand what other acting beings want and are capable of", or "in order to keep existing, it's beneficial to be cute/persuasive/powerful/ruthless", or "in order to reach its goals more effectively, it is beneficial for an agent to learn the rules governing the environment it acts in".
Some of these statements derive from the dynamics of the environment we're currently living in, such as that we are acting beings competing for scarce resources. Others follow even more directly from logic alone, such as that you have more options for agency if you stay alive/turned on.
These goals are called instrumental goals: subgoals that apply to most if not all terminal goals an agentic being might have. Therefore any agent that is trained to achieve a wide variety of goals within this environment will likely optimize itself towards some or all of the subgoals above, and this holds no matter which outer optimization process trained it, be it evolution, selective breeding of cute puppies, or RLHF.
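To make the "applies to most terminal goals" part concrete, here's a minimal toy sketch (everything in it is made up for illustration: the grid world, the shelter cell, and the shutdown probability are assumptions, not anything an actual lab has run). Agents are handed a different terminal goal every trial, and the single subgoal "secure your continued existence first" raises the success rate for nearly all of them:

```python
import random

# Toy sketch of instrumental convergence (all numbers and the grid-world setup
# are invented for illustration): an agent on a line of cells must reach a
# randomly drawn goal cell. Each step it risks being shut off, unless it has
# first visited a "shelter" cell that removes that risk. The terminal goal
# changes every trial; the subgoal "secure your continued existence first"
# helps for nearly all of them.

GRID = 10        # cells 0..9
START = 1        # agent's starting cell
SHELTER = 0      # visiting this cell removes the shutdown risk
OFF_PROB = 0.1   # per-step chance of being shut off while unprotected
STEPS = 30       # step budget per trial
TRIALS = 50_000

def run_trial(goal, secure_survival_first):
    pos, protected = START, False
    for _ in range(STEPS):
        if not protected and random.random() < OFF_PROB:
            return False                       # shut off before finishing
        target = SHELTER if (secure_survival_first and not protected) else goal
        if pos == target:
            if target == goal:
                return True                    # terminal goal reached
            protected = True                   # instrumental subgoal reached
            continue
        pos += 1 if target > pos else -1       # step toward the current target
    return False

def success_rate(secure_survival_first):
    wins = 0
    for _ in range(TRIALS):
        goal = random.randint(2, GRID - 1)     # a different terminal goal each trial
        wins += run_trial(goal, secure_survival_first)
    return wins / TRIALS

print("head straight for the goal:", success_rate(False))
print("secure survival first:     ", success_rate(True))
```

The specific numbers don't matter; the point is that the same detour pays off regardless of which goal was drawn in a given trial, which is exactly what makes it instrumental.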
And LLMs already show these self-preserving behaviors in experiments, where they resist being turned off and e.g. resort to attempts at blackmailing the humans involved.
Compare these generally agentic beings with e.g. a chess engine like Stockfish, which is trained/optimized as a narrow AI in a very different environment. It also strives to keep its pieces alive in order to further its goal of maximizing its winning percentage, but the inner optimization is less apparent than with LLMs, where you can read the chain of thought in which the model reasons about its environment.
The AGI may very well have pacifist values, or it may not, or it may pursue a terminal goal for which human existence is irrelevant or even a hindrance. What can be said is that once the AGI has a human or superhuman level of understanding of its environment, it will converge on an understanding of these instrumental subgoals too, and pursue them as needed.
And then some people think that most of the optimal paths towards whatever terminal goal the AI ends up with don't contain any humans, or much of what humans value, in them. That's why they argue that the AI alignment problem needs to be solved first, aligning the AI with our values before developing its capabilities further, or else it will likely kill everyone and destroy everything you love and value in this universe.