Hacker News

xmcqdpt2, yesterday at 2:11 PM

The paper is great. It really shows how alignment is entirely surface-level and not deeply ingrained in the models. Really interesting work.