mentalgear • today at 7:55 PM

I think therein lies another fun benchmark for showing that LLMs don't generalize: ask the LLM to solve the same logic riddle in several different languages. If it can solve the riddle in some languages but not in others, that's a strong argument for straightforward memorization and next-token prediction rather than true generalization.
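A rough sketch of what such a check could look like, in Python. Everything here is hypothetical: `ask_model` stands in for whatever LLM call you would actually use, `fake_model` is a toy stub that only "knows" the English phrasing, and the riddle and its translations are just illustrative. The scoring is a naive substring match, so treat it as a starting point rather than a real benchmark.

```python
from typing import Callable, Dict, List

# The same riddle in three languages (illustrative translations).
# The correct answer is the name "Carol" in every version.
PROMPTS: Dict[str, str] = {
    "en": "Alice is taller than Bob. Bob is taller than Carol. Who is the shortest?",
    "de": "Alice ist größer als Bob. Bob ist größer als Carol. Wer ist am kleinsten?",
    "fr": "Alice est plus grande que Bob. Bob est plus grand que Carol. Qui est la plus petite ?",
}
EXPECTED = "carol"


def cross_lingual_results(
    ask_model: Callable[[str], str], prompts: Dict[str, str], expected: str
) -> Dict[str, bool]:
    """Ask the same riddle in each language and record whether the reply
    contains the expected answer (naive, case-insensitive substring match)."""
    return {
        lang: expected in ask_model(prompt).lower()
        for lang, prompt in prompts.items()
    }


def inconsistent_languages(results: Dict[str, bool]) -> List[str]:
    """Languages where the model failed even though it solved the riddle
    in at least one other language. Empty if it failed (or passed) everywhere."""
    solved_somewhere = any(results.values())
    return [lang for lang, ok in results.items() if not ok] if solved_somewhere else []


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call: it 'memorized' only the
    English wording and answers wrongly otherwise."""
    return "Carol" if "shortest" in prompt else "Bob"


results = cross_lingual_results(fake_model, PROMPTS, EXPECTED)
print(results)
print(inconsistent_languages(results))
```

With a real model you would swap `fake_model` for an actual API call and run many riddles per language, since a single riddle says very little either way.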
