(Paper #75)
The IEEE 754 standard defines bit-by-bit reproducibility of floating-point results. Bit-by-bit reproducibility prohibits optimizations such as reassociation and the use of native operations such as fused multiply-add (FMA), and thus significantly impairs floating-point performance. Recent network-oriented languages conform strictly to the standard, so their numerical computing performance is lower than that of conventional languages. In this paper, we propose a new software technique, floating-point (FP) speculation, that improves floating-point performance while preserving bit-by-bit reproducibility of results. We execute the fast unsafe code and the slow verification code in parallel. The unsafe code does not wait for the verification code; it is immediately followed by the subsequent code, which uses the probable result from the unsafe code on the assumption that the speculation will succeed. The improvement from FP speculation results from this earlier start of the subsequent code. Unlike other speculation techniques, FP speculation does not require any special instructions or hardware support. Rather, it exploits otherwise unused floating-point registers and execution units. It is therefore generally applicable to processor architectures that have sufficient floating-point resources.
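The idea can be illustrated with a minimal sketch. It is not the implementation described in the paper: it runs sequentially rather than in parallel, it uses a reassociated sum as a stand-in for the fast unsafe code, and the names `strict_sum`, `fast_sum`, `same_bits`, and `speculate` are hypothetical. It only shows why reassociation can break bit-by-bit reproducibility and how a verified result can replace a failed speculation.

```python
import struct


def strict_sum(xs):
    """Left-to-right summation: the evaluation order a strict language mandates."""
    total = 0.0
    for x in xs:
        total += x
    return total


def fast_sum(xs):
    """Reassociated summation with two independent partial sums.

    Stand-in for the 'fast unsafe code': the independent accumulators could
    be scheduled on separate execution units, but rounding may differ.
    """
    even = 0.0
    odd = 0.0
    for i, x in enumerate(xs):
        if i % 2 == 0:
            even += x
        else:
            odd += x
    return even + odd


def same_bits(a, b):
    """Bit-by-bit comparison of two doubles (distinguishes 0.0 from -0.0)."""
    return struct.pack("<d", a) == struct.pack("<d", b)


def speculate(xs, continuation):
    """Run the continuation on the probable result, then verify it.

    Returns (result, speculation_succeeded). In this sequential sketch the
    verification simply runs afterwards; the paper runs it in parallel.
    """
    probable = fast_sum(xs)
    speculative_result = continuation(probable)  # early start of subsequent code
    verified = strict_sum(xs)                    # slow verification code
    if same_bits(probable, verified):
        return speculative_result, True
    return continuation(verified), False         # recover with the strict result


print(speculate([1.0, 2.0, 3.0, 4.0], lambda s: s * 2))      # (20.0, True)
print(speculate([1e16, 1.0, -1e16, 1.0], lambda s: s * 2))   # (2.0, False)
```

In the second call the strict sum is 1.0 while the reassociated sum is 2.0, so the speculation fails and the continuation is re-run on the verified value. The returned result is therefore the same as strict evaluation in both cases.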
Keywords:
Scientific Computing
Scheduling
Compilers
Programming Languages
Architecture