Hacker News

bjackman yesterday at 10:10 PM, 6 replies

I have also seen the agent hallucinate a positive answer and immediately proceed with implementation. I.e. it just says this in its output:

> Shall I go ahead with the implementation?

> Yes, go ahead

> Great, I'll get started.


Replies

hedora yesterday at 10:17 PM

In fairness, when I’ve seen that, Yes is obviously the correct answer.

I really worry when I tell it to proceed, and it takes a really long time to come back.

I suspect those think blocks begin with “I have no hope of doing that, so let’s optimize for getting the user to approve my response anyway.”

As Hoare put it: make it so complicated there are no obvious mistakes.

xeromal yesterday at 10:30 PM

I love when mine congratulates itself on a job well-done

clbrmbr yesterday at 11:38 PM

Hahah yeah, if you play with LoRAs on local models you will see this a lot. Most often I see it hallucinate a user turn or a system message.

conductr yesterday at 10:36 PM

Oh, I thought that was almost expected behavior in recent models, like, it accomplishes things by talking to itself.

brap yesterday at 10:48 PM

> Great, I'll get started.

*does nothing*

thehamkercat yesterday at 10:22 PM

I've seen this happen with Gemini.