Do you have evals for this claim? I don't really experience this.
Quick search:
- https://www.reddit.com/r/ChatGPT/comments/1owob2f/if_you_tel...
- https://www.reddit.com/r/ChatGPT/comments/1lca9mq/chatgpt_is...
If told to do A and not B, LLMs often just output B anyway once the context window gets large enough.
It's enough of a problem that it's in my private benchmarks for all new models.
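A minimal sketch of what a check like this could look like, not the actual private benchmark. Everything here is an assumption for illustration: the "no bullet points" rule standing in for "A and not B", the helper names (`build_prompt`, `violates`, `violation_rate`), and `fake_model`, which is a stand-in you would replace with a real model call.

```python
from typing import Callable

def build_prompt(filler_lines: int) -> str:
    """Instruction (do A, not B) followed by a growing amount of filler context."""
    instruction = "Answer in plain text. Do NOT use bullet points."
    filler = "\n".join(
        f"Note {i}: unrelated background text." for i in range(filler_lines)
    )
    return f"{instruction}\n{filler}\nQuestion: list three fruits."

def violates(output: str) -> bool:
    """True if any output line starts with a bullet marker (the forbidden B)."""
    return any(
        line.lstrip().startswith(("-", "*", "•")) for line in output.splitlines()
    )

def violation_rate(model: Callable[[str], str], filler_lines: int, trials: int = 5) -> float:
    """Fraction of trials in which the model's output breaks the rule."""
    prompt = build_prompt(filler_lines)
    return sum(violates(model(prompt)) for _ in range(trials)) / trials

def fake_model(prompt: str) -> str:
    """Hypothetical stand-in: obeys the rule on short prompts, breaks it on long ones."""
    if len(prompt) > 2000:
        return "- apple\n- banana\n- cherry"
    return "apple, banana, cherry"

if __name__ == "__main__":
    for n in (0, 50, 200):
        print(n, violation_rate(fake_model, n))
```

Sweeping `filler_lines` and watching where the rate climbs is the point: it turns "I don't really experience this" versus "I see it all the time" into a number per context length.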