logoalt Hacker News

adrian_btoday at 10:18 AM1 replyview on HN

While implementing double-precision by double-single may be a solution in some cases, the article fails to mention the overflow/underflow problem, which is critical in scientific/technical computing (a.k.a. HPC).

With the method from the article, the exponent range remains the same as in single precision, instead of being increased to that of double precision.

There are a lot of applications for which such an exponent range would cause far too frequent overflows and underflows. This could be avoided by introducing a lot of carefully-chosen scaling factors in all formulae, but this tedious work would remove the main advantage of floating-point arithmetic, i.e. the reason why computations are not done in fixed-point.

The general solution of this problem is to emulate double-precision with 3 numbers, 2 FP32 for the significand and a third number for the exponent, either a FP number or an integer number, depending on which format is more convenient for a given GPU.

This is possible, but it lowers considerably the achievable ratio between emulated FP64 throughput and hardware FP32 throughput, but the ratio is still better than the vendor-enforced 1:64 ratio.

Nevertheless, for now any small business or individual user can achieve a much better performance per dollar for FP64 throughput by buying Intel Battlemage GPUs, which have a 1:8 FP64/FP32 throughput ratio. This is much better than you can achieve by emulating FP64 on NVIDIA or AMD GPUs.

Intel B580 is a small GPU, so it has only a FP64 throughput about equal to a Ryzen 9 9900X and smaller than a Ryzen 9 9950X. However it provides that throughput at a much lower price. Thus if you start with a PC with a 9900X/9950X, you can double or almost double the FP64 throughput for a low additional price with an Intel GPU. Multiple GPUs will proportionally multiply the throughput.

The sad part is that with the current Intel CEO and with NVIDIA being a shareholder of Intel, it is unclear whether Intel will continue to compete in the GPU market, or they will abandon it, leaving us at the mercy of NVIDIA and AMD, which both refuse to provide products with good FP64 support to small businesses and individual users.


Replies

nsajkotoday at 11:15 AM

Yeah, double-word floating-point loses many of the desirable properties of the usual floating-point.