Hacker News

yorwba yesterday at 3:55 PM

https://arxiv.org/abs/2512.24880 was published less than two weeks ago, which should explain why it's not more common yet. And it's not that amazing either. It's a slight quality improvement for a slight increase in cost. It's not even clear to me whether it pays for itself.


Replies

solarkraft yesterday at 4:36 PM

My bad, I took this to be something related to Multi-head Latent Attention (MLA).