Anyone familiar with the literature knows if anyone tried figuring out why we don't add "s...

nodja • today at 12:58 PM • 0 replies • view on HN

Does anyone familiar with the literature know whether anyone has tried figuring out why we don't add "speaker" embeddings? So we'd have an embedding purely for system/assistant/user/tool, maybe even for the turn number, e.g. if multiple tools are called in a row. Surely it would perform better than expecting the attention matrix to look for special tokens, no?
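To make the idea concrete, here's a rough sketch of what I mean, not a claim about how any existing model does it. It's the same trick as BERT's segment embeddings: a learned vector per role (and optionally per turn) gets added to every token embedding in that span. All the names and sizes here (ROLES, D_MODEL, MAX_TURNS, VOCAB, make_table, embed) are made up for illustration, and the tables are random where a real model would learn them.

```python
import random

# Hypothetical setup: four speaker roles, a tiny model width, and toy table sizes.
ROLES = ["system", "user", "assistant", "tool"]
D_MODEL = 8
MAX_TURNS = 16
VOCAB = 100


def make_table(rows, dim, rng):
    """Randomly initialised embedding table (stand-in for learned weights)."""
    return [[rng.gauss(0.0, 0.02) for _ in range(dim)] for _ in range(rows)]


def embed(token_ids, role_ids, turn_ids, tok_table, role_table, turn_table):
    """Sum token, role, and turn embeddings position by position."""
    out = []
    for t, r, n in zip(token_ids, role_ids, turn_ids):
        out.append([a + b + c for a, b, c in
                    zip(tok_table[t], role_table[r], turn_table[n])])
    return out


rng = random.Random(0)
tok_table = make_table(VOCAB, D_MODEL, rng)
role_table = make_table(len(ROLES), D_MODEL, rng)
turn_table = make_table(MAX_TURNS, D_MODEL, rng)

# Toy conversation as (role, turn number, token ids); two tool calls in a row
# get different turn numbers even though the role is the same.
segments = [
    ("system", 0, [1, 2]),
    ("user", 1, [3, 4, 5]),
    ("assistant", 2, [6]),
    ("tool", 3, [7, 8]),
    ("tool", 4, [9]),
]

token_ids, role_ids, turn_ids = [], [], []
for role, turn, toks in segments:
    token_ids += toks
    role_ids += [ROLES.index(role)] * len(toks)
    turn_ids += [turn] * len(toks)

# One D_MODEL-sized vector per token, now carrying who said it and in which turn.
x = embed(token_ids, role_ids, turn_ids, tok_table, role_table, turn_table)
```

With this, the same token id gets a different input vector depending on who "said" it, so the model wouldn't have to attend back to a special token to recover the speaker. Whether that actually helps in practice is exactly what I'm asking about.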
