Hacker News

minimaxir, yesterday at 10:02 PM

Golden Gate Claude was two years ago, and it's surprising there hasn't been more research into targeted activations since.


Replies

landl0rd, today at 1:07 AM

There’s been some, but naive activation steering makes models dumber pretty reliably, and training an SAE is a pretty heavy lift.