Hacker News

kneyed, today at 2:30 AM

yes:

> In this experiment, however, the model recognizes the injection before even mentioning the concept, indicating that its recognition took place internally.

https://www.anthropic.com/research/introspection