Hacker News

joe_the_user · today at 3:42 AM

I think both the interpretability literature and explorations of internal representations actually reinforce the author's conclusion. Internal representation research tends to show that nets dealing with a single "model" don't necessarily share the same representation, and don't necessarily have a single representation at all.

And doing well on task XYZ isn't, by itself, evidence of a world model. The point that these systems aren't always using a world model is reinforced by how easily they are confused by extraneous information, even systems as sophisticated as those that can solve Math Olympiad questions. The literature has described them as "ad-hoc predictors" for a long time, and I don't think much has changed - except that they do better on benchmarks.

And humans, too, can act without a consistent world model.