Hacker News

behnamoh, last Sunday at 4:17 PM

sure, but this stuff is only obvious post hoc. so many people have tried to "justify" the attention mechanism in terms of their own area of expertise, but none of them came up with it first; ML engineers with ML thinking did.