logoalt Hacker News

andes314today at 3:56 PM1 replyview on HN

Linear time attention doesn’t work, by principle. Dead end pursuit. Much great research on more efficient quadratic time inference


Replies

smokeltoday at 4:57 PM

What about n log n?