
sdeframond · 05/15/2025 · 0 replies

Such results are inherently limited because the same word can have different meanings depending on context (e.g. "bank" in "river bank" vs. "bank account").

The role of the attention layers in LLMs is to give each token a better embedding by taking its context into account.
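
A minimal sketch of that idea, assuming a toy vocabulary, random projection matrices, and NumPy (none of this is from the comment, just an illustration of scaled dot-product self-attention): the same static embedding for "bank" comes out different once attention mixes in its neighbors.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # embedding dimension (arbitrary for this sketch)

# Toy static embeddings: "bank" has one vector regardless of context.
vocab = {"river": 0, "bank": 1, "money": 2, "account": 3}
E = rng.normal(size=(len(vocab), d))

# Random query/key/value projections stand in for learned weights.
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))

def self_attention(token_ids):
    """Return contextualized vectors for a sequence of token ids."""
    X = E[token_ids]                      # (seq_len, d) static embeddings
    Q, K, V = X @ Wq, X @ Wk, X @ Wv      # project to queries/keys/values
    scores = Q @ K.T / np.sqrt(d)         # scaled dot-product similarities
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over keys
    return weights @ V                    # context-weighted mix of values

# "bank" next to "river" vs. next to "money"/"account":
out_river = self_attention([vocab["river"], vocab["bank"]])
out_money = self_attention([vocab["money"], vocab["bank"], vocab["account"]])

# The static embedding of "bank" is identical in both inputs, but its
# attention output differs because the surrounding tokens differ.
print(np.allclose(out_river[1], out_money[1]))  # False
```

With only static embeddings the two "bank" vectors would be identical; the context-dependent output is what attention adds.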