Hacker News
ode • yesterday at 11:31 PM • 1 reply
Do we know why?
Replies
hammeiam • yesterday at 11:47 PM
Sparse attention. Per the paper, it's the highlight of this model.