Hacker News

charles_f | today at 5:27 AM | 0 replies

> That was it. The agent had invented a mental state for me and then used that invented state to justify ignoring the rules.

Or: the agent produced shit because the context was getting long, instructions were lost in compaction, and it defaulted back to garbage code. Then when you asked "why are you cutting corners", it did what LLMs do: it found the next tokens completing the sentence "why do you cut corners", which is plausibly "because you're in a hurry".

It would be interesting to see what it answers if you ask "why are you producing such beautiful, intelligently crafted, very good code" the next time it spits out garbage.
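The mechanism above can be sketched with a toy (entirely hypothetical, hand-built) continuation table, nothing like a real LLM: the "explanation" is just whatever plausibly completes the leading question, with no reference to any actual internal state.

```python
# Toy sketch: answers are picked purely by what plausibly completes the
# prompt. Opposite leading questions yield opposite, equally confident
# answers -- no introspection involved. All entries here are invented.
CONTINUATIONS = {
    "why do you cut corners": [
        "because you're in a hurry",
        "because the deadline is tight",
    ],
    "why do you write beautiful code": [
        "because quality matters to me",
        "because i take pride in my work",
    ],
}

def complete(prompt: str) -> str:
    """Return the top-ranked continuation for a leading question."""
    key = prompt.lower().rstrip("?")
    options = CONTINUATIONS.get(key, ["i don't know"])
    return options[0]  # greedy pick: the most plausible continuation wins

if __name__ == "__main__":
    print(complete("Why do you cut corners?"))
    print(complete("Why do you write beautiful code?"))
```

Either way the model "explains" itself fluently; the answer tracks the framing of the question, not any fact about why the output was bad.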

> LLM confabulation isn’t alien. It’s inherited: the models train on human text,

I think this also extrapolates one step too far: it's confabulating not because the training data does so, but because it needs to provide an answer to the question, and one that's plausible at that.