Hacker News

MichaelZuo, last Tuesday at 12:07 AM

Elephant mirror tests already existed, so it doesn’t seem all that novel when the word “LLM” could just be substituted for the word “elephant”?


Replies

hackinthebochs, last Tuesday at 12:43 AM

The question isn't about universal novelty, but whether the prompt/context is novel enough that the LLM answering competently demonstrates understanding. The claim of parroting is that the dataset contains a near-exact duplicate of any given prompt, so when the LLM appears competent it is really just reciting from memory. But if an LLM can generalize from an elephant mirror test to an LLM mirror test in an entirely new context (being shown pictures and asked to describe them), that demonstrates sufficient generalization to "understand" the concept of a mirror test.
