Hacker News

dash2, today at 2:14 AM

> it often makes early assumptions and fails to validate them, which can waste a lot of time

Is this baked into how the models are built? A model outputs a bunch of tokens, then reads them back and treats them as existing "state" that has to be built on. So if the model has earlier said (or acted as if) a given assumption is true, it is going to assume "oh, I said that, so it must be the case". Presumably one reason that hacks like inserting "Wait..." exist is to work around this problem.
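
To make the loop I mean concrete, here is a toy sketch in Python. It is only an illustration under my own assumptions, not how any real model is implemented: `toy_next_token`, `generate`, and the token names are all made up, and the "model" is a hand-written rule. The only point is that generated tokens get appended to the context and are then indistinguishable from anything else in it.

```python
from typing import Callable, List

Token = str
NextToken = Callable[[List[Token]], Token]


def generate(next_token: NextToken, prompt: List[Token], steps: int) -> List[Token]:
    """Autoregressive loop: each new token is appended to the context
    and becomes part of the input for every later step."""
    context = list(prompt)
    for _ in range(steps):
        context.append(next_token(context))
    return context


def toy_next_token(context: List[Token]) -> Token:
    """Hypothetical stand-in for a model (an assumption for illustration).
    It cannot tell which tokens it wrote itself; it only sees the context."""
    if context[-1] == "Wait...":
        # An injected "Wait..." steers the next step toward re-checking.
        return "recheck"
    if "assume_X" in context:
        # The earlier assumption is now just part of the state to build on.
        return "build_on_X"
    return "assume_X"


if __name__ == "__main__":
    out = generate(toy_next_token, ["task"], 3)
    print(out)  # ['task', 'assume_X', 'build_on_X', 'build_on_X']

    # Appending "Wait..." to the context changes what comes next.
    print(generate(toy_next_token, out + ["Wait..."], 1)[-1])  # recheck
```

In this toy, nothing ever revisits `assume_X` unless something in the context prompts it to, which is the behavior I am guessing at above. Whether real models work this way internally is exactly the open question.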