actionfromafar • yesterday at 6:00 PM • 1 reply

Doesn't that have to do with how many bits you allow in the actual calculation in physical reality?

Replies

Well, for multiplication, complexity is defined directly in terms of the number of digits/bits. For attention, complexity is defined in terms of the number of input vectors, which are all at fixed precision. I don't understand what happens to the method proposed in the paper at higher precision (since I don't understand the paper), but in reality it doesn't matter, since there is no value in anything over float16 for machine learning.
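
To make the distinction concrete, here is a rough sketch (not taken from the paper) that counts elementary operations in the two settings. The function names and `dim=64` are illustrative assumptions. It models only schoolbook multiplication and the pairwise-score step of naive attention, ignoring softmax and the value multiplication.

```python
def schoolbook_mul_ops(num_digits: int) -> int:
    """Single-digit multiplications needed to multiply two numbers
    of num_digits digits each with the schoolbook method."""
    return num_digits * num_digits


def naive_attention_ops(num_vectors: int, dim: int) -> int:
    """Scalar multiply-adds needed to form every pairwise dot product
    between num_vectors queries and num_vectors keys of length dim.
    Each scalar is assumed to be fixed precision, so bit width does
    not appear in the count."""
    return num_vectors * num_vectors * dim


if __name__ == "__main__":
    # Size parameter: digits for multiplication, vectors for attention.
    for n in (1_000, 2_000, 4_000):
        print(n, schoolbook_mul_ops(n), naive_attention_ops(n, dim=64))
```

Both counts grow quadratically, but in different things: one in the length of the numbers, the other in how many vectors there are, with the width of each scalar held constant.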
