Hacker News

_heimdall today at 11:47 AM

There aren't many working on it though, definitely not enough given how many resources are going into building AI.

AI safety teams at these labs are largely focused on surface-level measures and aren't empowered to stop the company's progress. I was surprised when Anthropic initially held Mythos back from the public, but it was always a temporary measure to give controlled access rather than a pause to make meaningful improvements in AI safety.


Replies

jnwatson today at 1:17 PM

The only measures we see are the surface-level ones, because those are the only ones that sort of work.

Alignment is a hard, possibly impossible problem. Anthropic's gambit is that they will luck into a solution before the paperclip maximizers take over.

coderatlarge today at 12:50 PM

I wish Ilya and crew would chime in.