The question isn't about universal novelty, but whether the prompt/context is novel enough that answering it competently demonstrates understanding. The parroting claim is that the training data contains a near-exact duplicate of any given prompt, so what looks like competence is really just memorization. But if an LLM can generalize from the elephant mirror test to an LLM mirror test in an entirely new context (being shown pictures and asked to describe them), that demonstrates enough generalization to "understand" the concept of a mirror test.