Hacker News

lelandfe · today at 10:08 AM · 8 replies

In chats that run long enough on ChatGPT, you'll see it begin to confuse prompts and responses, and eventually even confuse both for its system prompt. I suspect this sort of problem exists widely in AI.


Replies

insin · today at 10:18 AM

Gemini seems to be an expert at mistaking its own terrible suggestions for ones written by you, if you keep going instead of pruning the context.

jwrallie · today at 10:37 AM

I think it’s good to play with smaller models to get a grasp of this kind of problem, since it happens more often there and is much less subtle.

throw310822 · today at 11:14 AM

Makes me wonder if during training LLMs are asked to tell whether they've written something themselves or not. It should be quite easy: ask the LLM to produce many continuations of a prompt, mix them with many others produced by humans, and then ask the LLM to tell them apart. This should be possible by introspecting on the hidden layers and comparing with the provided continuation. I believe Anthropic has already demonstrated that models have partially developed this capability, but it should be trivial and useful to train it explicitly.
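The data-construction step this comment proposes can be sketched in a few lines. This is only an illustration of the idea, not anything a lab has published; the function name, dict keys, and "self"/"other" labels are all made up for the example:

```python
import random

def build_self_recognition_set(prompt, model_continuations,
                               human_continuations, seed=0):
    """Mix model-written and human-written continuations of the same
    prompt into one shuffled, labeled set.  The label is what the
    model would then be trained to predict ("self" vs "other")."""
    examples = (
        [{"prompt": prompt, "continuation": c, "label": "self"}
         for c in model_continuations]
        + [{"prompt": prompt, "continuation": c, "label": "other"}
           for c in human_continuations]
    )
    random.Random(seed).shuffle(examples)  # fixed seed for reproducibility
    return examples
```

The actual training signal (introspecting on hidden layers vs. just fine-tuning on the labels) is the hard, open part; this only shows how the labeled mixture would be assembled.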

j-bos · today at 11:10 AM

At work, where LLM-based tooling is being pushed haaard, I'm amazed every day that developers don't know, let alone intuit as second nature, this and other emergent behaviors of LLMs. But seeing that gap here on HN, with an article on the front page, boggles my mind. The future really is unevenly distributed.

sixhobbits · today at 10:28 AM

Author here, interesting to hear. I generally start a new chat for each interaction, so I've never noticed this in the chat interfaces, only with Claude via Claude Code. But I guess my sessions there do get much longer, so maybe I'm wrong that it's a harness bug.

scotty79 · today at 12:00 PM

It makes sense. It's all probabilistic, and it all gets fuzzy as garbage accumulates in the context. User messages and the system prompt go through the same network of math as the model's own thinking and responses.
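A minimal sketch of why roles can blur: a chat is ultimately flattened into one token stream, and the role markers are just more tokens in that stream. The `<|role|>` delimiters below are hypothetical; real chat templates differ per model:

```python
def flatten_chat(messages):
    """Flatten a chat into the single text stream the model actually
    conditions on.  Role markers are ordinary tokens in that stream,
    so nothing structural stops the model from attributing a span to
    the wrong role once the context gets long and noisy."""
    return "".join(f"<|{m['role']}|>\n{m['content']}\n" for m in messages)

chat = [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hi"},
    {"role": "assistant", "content": "Hello!"},
]
```

The separation between system prompt, user, and assistant exists only by convention in this flat sequence, which is consistent with the confusion described upthread.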
