Hacker News

cortesoft · yesterday at 7:52 PM · 2 replies

> Whereas for an agent it will happily include details that are not literally in its chain of thought as justifications for its decisions.

Humans do this too, ALL THE TIME. We rationalize decisions after we make them, and truly believe that is why we made the decision. We do it for all sorts of reasons, from protecting our ego to simply needing to fill in gaps in our memory.

Honestly, I feel like asking an AI for its train of thought behind a decision is slightly more useful than asking a human (although not much more useful), since an LLM has a better ability to recreate a decision process than a human does: an LLM can perfectly forget new information to recreate a previous decision, simply by being re-run on its original context.
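To make that concrete, here is a minimal sketch of what “perfectly forgetting” looks like in practice. `llm_complete` is a hypothetical stand-in for whatever model API you use, not a real library call (the fake below is deterministic so the sketch runs end to end; real providers may not be perfectly deterministic even at temperature 0). Replaying a decision is just re-running the model on a frozen snapshot of the exact context it originally saw:

```python
import hashlib
import json

def llm_complete(messages, temperature=0.0, seed=42):
    # Hypothetical stand-in for a real model API call, made
    # deterministic (hash of the inputs) so the sketch is runnable.
    digest = hashlib.sha256(json.dumps([messages, seed]).encode()).hexdigest()
    return f"decision-{digest[:8]}"

def record_decision(messages, path="decision_snapshot.json"):
    # Snapshot the exact context the model saw at decision time.
    answer = llm_complete(messages)
    with open(path, "w") as f:
        json.dump({"messages": messages, "answer": answer}, f)
    return answer

def replay_decision(path="decision_snapshot.json"):
    # The model "forgets" everything learned since the original run:
    # it only ever sees the frozen snapshot, so re-asking "why did
    # you decide that?" happens against the original inputs.
    with open(path) as f:
        snapshot = json.load(f)
    return llm_complete(snapshot["messages"])
```

A human can’t do the equivalent: once you know how things turned out, you can’t un-know it while explaining why you decided what you did.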

Of course, I don’t think it is super useful for either humans or LLMs. Trying to get the human OR LLM to simply “think better next time” isn’t going to work. You need actual process changes.

This was a rule we always had at my company for any after-incident learning review: plan for a world where we are just as stupid tomorrow as we are today. In other words, the action item can’t be “be more careful next time”, because humans forget sometimes (just like LLMs). You will THINK you are being careful, but a detail slips your mind, or you misremember what situation you are in, or you don’t realize the outside situation has changed (e.g. you bumped the keyboard and are now typing in another console window).

Instead, the safety improvements have to be about guardrails you put up, or mitigations you put in place to prevent disaster the NEXT time you fail to be as careful as you are trying to be.

Because there is always a next time.
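As one illustration of the kind of guardrail meant here (a hypothetical deploy helper, not anything from a real incident): instead of trusting the operator to notice which window they are typing in, make the dangerous path demand proof before it proceeds:

```python
import sys

def confirm_environment(expected: str) -> None:
    # Guardrail: refuse to proceed unless the operator retypes the
    # target environment. This blocks the "wrong console window"
    # failure mode, because muscle memory alone can't pass the check.
    typed = input(f"Type the environment name to deploy to ({expected!r}): ")
    if typed.strip() != expected:
        sys.exit(f"Aborting: you typed {typed!r}, expected {expected!r}.")

def deploy(environment: str) -> None:
    if environment == "production":
        confirm_environment("production")
    print(f"Deploying to {environment}...")  # real deploy logic would go here

if __name__ == "__main__":
    deploy(sys.argv[1] if len(sys.argv) > 1 else "staging")
```

The point of the check isn’t that people will read the prompt carefully; it’s that a bumped keyboard can’t type the environment name by accident.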

Honestly, I think the biggest struggle we are having with LLMs is not knowing when to treat them like a normal computer program and when to treat them like a more human-like intelligence. We run into both failure modes all the time: we expect an LLM to behave like a human when it doesn’t, then turn around and expect it to behave like a normal computer program when it doesn’t do that either.

This is BRAND NEW territory, and we are going to make so many mistakes while we try to figure it out. We have to expect that if we want to use LLMs for useful things.


Replies

iainmerrick · yesterday at 8:24 PM

> Plan for a world where we are just as stupid tomorrow as we are today. In other words, the action item can’t be “be more careful next time”, because humans forget sometimes (just like LLMs).

That’s a great way of putting it, I’ll remember that one (except when I forget...)

fragmede · yesterday at 7:57 PM

You're right, but having a backup is older than computers.