clbrmbr, yesterday at 11:36 PM

This. The models struggle with differentiating tool responses from user messages.

The trouble is these are language models with only a veneer of RL that gives them awareness of the user turn. They have very little pretraining on this idea of being in the head of a computer with different people and systems talking to you at once. There’s more that needs to go on than eliciting a pre-learned persona.