Hacker News

derbOac, today at 1:43 AM

You might be completely correct, although my hunch is this is something that would require a change in architecture rather than increases in scale.

The failures show up in a fairly simple task (Stroop) as the number of repeated trials increases. It's not like the number of colors or color words is increasing, which is the sort of thing I might expect if it had to do with the size of the LLM.

On the other hand, who knows. I agree that changes in model scale make a lot of things a moving target.

At first I thought this paper was kind of odd, but then I felt like it was possibly onto something important. Intuitively, I could see whatever is causing this failure in the Stroop task being related to the tendency of LLMs to be "derailable".