Hacker News

Const-me today at 6:06 PM

> AVX2 level includes FMA (fast multiply-add)

FMA does not stand for fast multiply-add; it stands for fused multiply-add. Fused means the instruction computes the entire a * b + c expression at full precision (the intermediate product keeps twice as many mantissa bits) and only then rounds the result, once, to the precision of the arguments.
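To make the single-rounding point concrete, here is a minimal sketch in Python. It does not use a hardware FMA instruction: `fused_multiply_add` is a hypothetical helper that emulates the fused behaviour by computing a * b + c exactly with rationals and rounding once, and it assumes finite inputs. The input values are chosen for illustration so that the separately rounded product loses the one bit that matters.

```python
from fractions import Fraction


def unfused_multiply_add(a: float, b: float, c: float) -> float:
    # Two roundings: a * b is rounded to a double, then the sum is rounded again.
    return a * b + c


def fused_multiply_add(a: float, b: float, c: float) -> float:
    # One rounding: evaluate a * b + c exactly as a rational number,
    # then round to the nearest double. This is an emulation for
    # illustration (finite inputs assumed), not a hardware FMA.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# The exact product is 1 - 2**-54, which does not fit in a double
# and rounds to 1.0 when the multiply is rounded on its own.
a = 1.0 + 2.0**-27
b = 1.0 - 2.0**-27
c = -1.0

print(unfused_multiply_add(a, b, c))  # 0.0
print(fused_multiply_add(a, b, c))    # -5.551115123125783e-17
```

The unfused version returns zero because the tiny term is lost when the product is rounded, while the fused version keeps it. An emulator that cannot map a fused instruction onto a fused instruction has to reproduce this behaviour some other, slower way.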

It might be that the Prism emulator failed to translate each FMA instruction into a pair of FMLA instructions (the equally fused ARM64 equivalent) and instead emulated the fused behaviour some other way, which in turn degraded the performance of the AVX2 emulation.


Replies

vintagedave today at 6:14 PM

Author here - thanks - my bad. Fixed 'fast' -> 'fused' :)

I don't have insight into how Prism works, but I have wondered whether the right debugger could show the generated ARM code and let us see exactly what was going on.
