Reproducible, Accurately Rounded and Efficient BLAS

Chemseddine Chohra 1, Philippe Langlois 1, David Parello 1
1 DALI - Digits, Architectures et Logiciels Informatiques
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, UPVD - Université de Perpignan Via Domitia
Abstract : Numerical reproducibility failures arise in parallel computation because floating-point addition is not associative. Massively parallel and optimized executions dynamically modify the order of floating-point operations, so numerical results may change from one run to another. We propose to ensure reproducibility by extending, as far as possible, the IEEE-754 correct rounding property to larger operation sequences. We introduce RARE-BLAS (Reproducible, Accurately Rounded and Efficient BLAS), which benefits from recent accurate and efficient summation algorithms. Solutions for level 1 (asum, dot and nrm2) and level 2 (gemv) routines are presented. Their performance is compared with that of the Intel MKL library and of other existing reproducible algorithms. For both shared and distributed memory parallel systems, we exhibit an extra cost of 2× in the worst-case scenario, which is satisfactory for a wide range of applications. On the Intel Xeon Phi accelerator, a larger extra cost (4× to 6×) is observed, which is still useful at least for debugging and validation steps.
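
The link between operation order and reproducibility can be illustrated with a minimal Python sketch. It is not the RARE-BLAS implementation: `math.fsum` from the Python standard library is used here only as a stand-in for a correctly rounded summation, and the helper names `naive_sum` and `results_over_orderings` are hypothetical. The sketch sums the same three values in every possible order, once with ordinary left-to-right accumulation and once with the correctly rounded sum.

```python
import itertools
import math


def naive_sum(values):
    """Left-to-right accumulation: one rounding after every addition."""
    total = 0.0
    for v in values:
        total += v
    return total


def results_over_orderings(values, summation):
    """Collect the distinct results obtained over all orderings of the input."""
    return {summation(p) for p in itertools.permutations(values)}


# Assumed test data: 1.0 is absorbed whenever it is added to +/-1e16 first.
data = [1e16, 1.0, -1e16]

naive_results = results_over_orderings(data, naive_sum)
rounded_results = results_over_orderings(data, math.fsum)

print(sorted(naive_results))    # [0.0, 1.0]: the result depends on the order
print(sorted(rounded_results))  # [1.0]: the correctly rounded sum is unique
```

Because a correctly rounded result is defined by the exact sum alone, it cannot depend on the order in which a parallel run happens to combine the terms, which is the property the abstract proposes to extend to BLAS routines.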
https://hal-lirmm.ccsd.cnrs.fr/lirmm-01280324
Contributor : Philippe Langlois
Submitted on : Thursday, July 28, 2016 - 1:22:03 PM
Last modification on : Tuesday, February 19, 2019 - 8:28:01 PM

File

REPPAR16.pdf
Files produced by the author(s)

Identifiers

  • HAL Id : lirmm-01280324, version 2

Citation

Chemseddine Chohra, Philippe Langlois, David Parello. Reproducible, Accurately Rounded and Efficient BLAS. REPPAR: Reproducibility in Parallel Computing, Aug 2016, Grenoble, France. ⟨lirmm-01280324v2⟩
