Just go and ask ChatGPT or Claude something that can't possibly be in its training set. Make something up. If it were only memorising answers, it would be impossible for it to get the correct result.
A simple nonsense programming task would suffice. For example: "write a Python function to erase every character from a string unless either of its adjacent characters is also adjacent to it in the alphabet. The string only contains lowercase a-z".
That task isn't anywhere in their training data, so they can't have memorised the answer. But I bet ChatGPT and Claude can still do it.
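For reference, here's roughly what a correct answer could look like under one reading of the prompt (each character checked against its neighbours in the original string, in a single pass); the function name and that single-pass interpretation are my own:

    def keep_alphabet_neighbours(s: str) -> str:
        """Keep a character only if at least one of its neighbours in the
        string is also its neighbour in the alphabet (e.g. 'b' next to 'a'
        or 'c'); erase everything else. Assumes s is lowercase a-z."""
        def adjacent(a: str, b: str) -> bool:
            # two letters are alphabet-adjacent when their codes differ by 1
            return abs(ord(a) - ord(b)) == 1

        kept = []
        for i, ch in enumerate(s):
            left_ok = i > 0 and adjacent(ch, s[i - 1])
            right_ok = i < len(s) - 1 and adjacent(ch, s[i + 1])
            if left_ok or right_ok:
                kept.append(ch)
        return "".join(kept)

    print(keep_alphabet_neighbours("abzxcd"))  # -> "abcd"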
Honestly this is sooooo obvious to anyone that has used these tools, it's really insane that people are still parroting (heh) the "it just memorises" line.
People who say that LLMs just memorize stuff are just as clueless as those who assume there's any reasoning happening.
They generate statistically plausible answers (to oversimplify) based on the training set and the weights they have.
LLMs don't "memorize" concepts the way humans do. They generate output based on token patterns in their training data. So instead of having to be trained on every possible problem, they can still produce output that solves a new one by emitting the most probable combination of tokens for the given input tokens. To humans this looks like genuinely solving novel problems, but it's a trick of statistics. These tools can reference and generate patterns at a scale no human ever could. That is what makes them useful and powerful, but I would argue not intelligent.
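To make the "most probable combination of tokens" point concrete, here's a toy sketch of the sampling step. The vocabulary and scores are made up for illustration; in a real model the scores come from running the whole context through learned transformer weights over a vocabulary of tens of thousands of tokens, but the core loop is the same: score candidate next tokens, turn the scores into probabilities, sample one.

    import math
    import random

    # made-up mini vocabulary standing in for a real tokenizer's
    vocab = ["def", "return", "(", ")", "s", ":", "\n"]

    def next_token(logits: list[float], temperature: float = 1.0) -> str:
        # softmax over the scores, then sample one token at random
        # in proportion to its probability
        scaled = [l / temperature for l in logits]
        m = max(scaled)
        exps = [math.exp(l - m) for l in scaled]
        total = sum(exps)
        probs = [e / total for e in exps]
        return random.choices(vocab, weights=probs, k=1)[0]

    # made-up scores standing in for "what the model learned"
    print(next_token([2.0, 0.1, 1.5, 0.2, 0.3, 0.1, 0.0]))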