The thing is, the same agent that made the bananas mistake is also quite good at catching that mistake (if called again with fresh context). This results in convergence on working, non-bananas solutions.