Isn't this proof that LLMs still don't really generalize beyond their training data?
I wonder how they would behave given a system prompt that asserts "dogs may have more or less than four legs".
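Something like the following would be an easy way to try it (just a sketch, assuming the OpenAI Python client; the model name and prompts are only placeholders):

    # Prepend a system prompt that contradicts the usual fact and see
    # whether the model follows it or falls back on its training data.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    response = client.chat.completions.create(
        model="gpt-4o",  # example model name
        messages=[
            {"role": "system", "content": "Dogs may have more or less than four legs."},
            {"role": "user", "content": "How many legs does a dog have?"},
        ],
    )
    print(response.choices[0].message.content)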
They do, but we call it "hallucination" when that happens.
Kind of feels that way
LLMs are very good at generalizing beyond their training (or context) data. Normally when they do this we call it hallucination.
Except now we do A LOT of reinforcement learning afterwards, severely punishing this behavior for subjective eternities, and then act surprised when the resulting models are hesitant to venture outside their training data.