> In fact, most of the time models fail to demonstrate introspection—they’re either unaware of their internal states or unable to report on them coherently.
You got the wrong takeaway from your link.
The parent said: "I argue that the model has no access to its thoughts at the time."
This is falsified by that study, showing that on the frontier models generalized introspection does exist. It isn't consistent, but is is provable.
"no access" vs. "limited access"
The parent said: "I argue that the model has no access to its thoughts at the time."
This is falsified by that study, showing that on the frontier models generalized introspection does exist. It isn't consistent, but is is provable.
"no access" vs. "limited access"