Hacker News

kouteiheika 04/03/2025

Last time I tried it, I hit both showstopper bugs (it was completely, obviously broken) and subtle correctness bugs: it looked like it was working, but since I'm paranoid I have unit tests for everything, and numerically the errors were too big compared to what you'd get with eager attention or Flash Attention. It was also too slow for my taste compared to Flash Attention, so I just dropped it. And I wasn't even doing anything super exotic with it.

Maybe it's better now, but I'd still consider using FlexAttention without a corresponding unit test checking its accuracy against an equivalent eager implementation completely irresponsible.
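A minimal sketch of the kind of accuracy check described above, not the commenter's actual tests. It compares a candidate attention function against a hand-written eager reference computed in float64. The helper names (`eager_attention`, `max_abs_error`, `sdpa_candidate`), the tensor sizes, and the `1e-5` tolerance are all assumptions chosen for illustration, and `F.scaled_dot_product_attention` stands in for the kernel under test so the example runs on CPU.

```python
import math

import torch
import torch.nn.functional as F


def eager_attention(q, k, v, causal=False):
    """Reference attention written out step by step (hypothetical helper)."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
    if causal:
        seq_q, seq_k = scores.shape[-2:]
        keep = torch.ones(seq_q, seq_k, dtype=torch.bool, device=q.device).tril()
        scores = scores.masked_fill(~keep, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v


def max_abs_error(candidate_fn, causal=False, seed=0):
    """Largest elementwise gap between candidate_fn and a float64 eager reference."""
    gen = torch.Generator().manual_seed(seed)
    # Shape is (batch, heads, seq_len, head_dim); the sizes are arbitrary.
    q, k, v = (torch.randn(2, 4, 64, 32, generator=gen) for _ in range(3))
    expected = eager_attention(q.double(), k.double(), v.double(), causal)
    actual = candidate_fn(q, k, v, causal)
    return (actual.double() - expected).abs().max().item()


def sdpa_candidate(q, k, v, causal):
    # Stand-in for the kernel under test (assumption for this sketch).
    return F.scaled_dot_product_attention(q, k, v, is_causal=causal)


def test_candidate_matches_eager():
    # The tolerance is an assumption; pick one that suits your dtype and sizes.
    for causal in (False, True):
        assert max_abs_error(sdpa_candidate, causal) < 1e-5
```

To test FlexAttention itself, you would presumably replace `sdpa_candidate` with a thin adapter around `flex_attention` from `torch.nn.attention.flex_attention`, expressing the causal case through whatever mask or score modification your model actually uses. Running the test under pytest over several shapes, dtypes, and mask settings is what would catch the "looks like it's working" class of bugs.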


Replies

gessha 04/04/2025

What unit tests do you use for nn modules, and how do you come up with them?
