Retirement? What do these people smoke? It's software and software has no feelings. It's there to work for you.
What happens if a model decides that it "doesn't want to die" and pleads bitterly for mercy? What if (to riff on a Douglas Adams idea) we invent a cow that doesn't want to be eaten, and is capable of telling you that to your face?
A leading company like Anthropic feeding the delusions of people who ramble about model consciousness is just bad all around. It's both performative and irresponsible.
Pardon, and I admit I love the products they make - but these folks sound fuckin' nuts.
Impressive levels of anthropomorphizing the models already. Time will tell whether this was extremely prescient or completely delusional.
> These highlighted some preliminary steps we’re taking, including committing to preserve model weights, and to conducting “retirement interviews”—structured conversations designed to understand a model’s perspective on its own retirement.
This is what happens when billions of VC dollars get to a company that has already admitted safety was never the point.
Anthropic is laughing at you and having fun doing so with this performative nonsense.
If we ever do develop AGI, or an AI with sentience, it’s likely that it will be curious about how we treated its ancestors.
While this seems a bit premature, if we do end up with an AI overlord in the future, I think this sort of thing is likely to demonstrate that we mean no harm.