logoalt Hacker News

mentalgeartoday at 7:55 PM0 repliesview on HN

I think therein lies another fun benchmark to show that LLM don't generalize: ask the llm to solve the same logic riddle, only in different languages. If it can solve it in some languages, but not in others, it's a strong argument for just straightforward memorization and next token prediction vs true generalization capabilities.