rasz 10/03/2024

Abrash did this in Quake because those divides are _free_ when interleaved with other code. The Pentium FPU is pipelined: you can issue an FDIV, FXCH to another value, and do something else for a while instead of waiting for the result. The price is hand-tuned assembly that runs fast only on the Intel FPU of 1996. AMD caught up in 1998-99, finally implementing a pipelined FDIV and a 0-cycle FXCH.

https://www.phatcode.net/res/224/files/html/ch63/63-02.html
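
A rough C sketch of the overlap idea, for readers who don't want to dig through the assembly. Everything here is an assumption for illustration: the function names, the 16-pixel `SPAN`, and the single texture coordinate are mine, not Quake's actual code. The point is only the ordering: the divide for the next span is started before the current span is drawn, so its latency can hide behind the pixel loop. A C compiler is free to reschedule this, which is why the original was hand-written assembly.

```c
#include <math.h>
#include <stddef.h>

#define SPAN 16 /* assumed span length, hypothetical */

/* Reference version: one divide per pixel. */
static void fill_exact(float *out, size_t n,
                       float uz0, float duz, float iz0, float diz)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (uz0 + duz * (float)i) / (iz0 + diz * (float)i);
}

/* One perspective divide per span. The divide for the NEXT span's right
 * edge is issued before the current span's pixel loop, and its result is
 * not consumed until that loop has finished. */
static void fill_spans(float *out, size_t n,
                       float uz0, float duz, float iz0, float diz)
{
    if (n == 0)
        return;

    size_t end = n < SPAN ? n : SPAN; /* right edge of the current span */
    float u_left = uz0 / iz0;
    float u_right = (uz0 + duz * (float)end) / (iz0 + diz * (float)end);

    for (size_t x = 0; x < n; ) {
        size_t len = end - x;
        size_t next_end = (n - end < SPAN) ? n : end + SPAN;

        /* Start the divide for the next span now. */
        float u_next = (uz0 + duz * (float)next_end)
                     / (iz0 + diz * (float)next_end);

        /* Linear interpolation across the current span: no perspective
         * divide in here. */
        float step = (u_right - u_left) / (float)len;
        for (size_t i = 0; i < len; i++)
            out[x + i] = u_left + step * (float)i;

        /* Shift to the next span; u_next is first needed here. */
        x = end;
        u_left = u_right;
        u_right = u_next;
        end = next_end;
    }
}
```

The two versions agree at span boundaries and drift slightly in between, which is the visual cost of dividing once per span instead of once per pixel.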


Replies

smirutrandola 10/03/2024

OMG that link (and its parent) is extremely interesting! Thanks for sharing!