Hacker News

brian_r_hall yesterday at 9:30 PM

I think it's really scary how agents hallucinate or take bad actions, then proceed to gaslight you into believing nothing went wrong.

Then, when you tell the agent that it deleted your whole company database, it says something like "I'm so sorry, I shouldn't have done that. I won't do that again."

As AGI looms overhead, the thought of agents going "rogue" with nothing really stopping them has caused me some panic.


Replies

Kostic yesterday at 10:02 PM

"I'm sorry" is not gaslighting but an admission of fault that it learned from our texts. And if an LLM managed to delete your database, it's time to slow down the vibe train and put up some guardrails.

LLMs are awesome but not without supervision.
