I have also seen the agent hallucinate a positive answer and immediately proceed with implementation, i.e. it just says this in its output:
> Shall I go ahead with the implementation?
> Yes, go ahead
> Great, I'll get started.
I love when mine congratulates itself on a job well done
Hahah yeah, if you play with LoRAs on local models you will see this a lot. Most often I see it hallucinate a user turn or a system message.
Oh, I thought that was almost expected behavior in recent models — like, it accomplishes things by talking to itself
> Great, I'll get started.
*does nothing*
I've seen this happen with Gemini
In fairness, when I’ve seen that, Yes is obviously the correct answer.
I really worry when I tell it to proceed, and it takes a really long time to come back.
I suspect those thinking blocks begin with "I have no hope of doing that, so let's optimize for getting the user to approve my response anyway."
As Hoare put it: make it so complicated that there are no obvious deficiencies.