D. H. Bailey, R. Barrio, and J. M. Borwein, High-precision computation: Mathematical physics and dynamics, Appl. Math. and Comput, vol.218, issue.20, pp.10106-10121, 2012.

K. Bergman, Exascale computing study: Technology challenges in achieving exascale systems, 2008.

N. Whitehead and A. Fit-Florea, Precision & performance: Floating point and IEEE 754 compliance for NVIDIA GPUs, NVIDIA, 2011.

M. Corden, Differences in floating-point arithmetic between Intel® Xeon® processors and the Intel® Xeon Phi™ coprocessor, Intel, 2013.

A. Katranov, Deterministic reduction: a new community preview feature in Intel® Threading Building Blocks, Intel, 2012.

K. Doertel, Best known method: Avoid heterogeneous precision in control flow calculations, Intel, 2013.

J. Demmel and H. D. Nguyen, Fast reproducible floating-point summation, Proceedings of the 21st IEEE Symposium on Computer Arithmetic, pp.163-172, 2013.
DOI : 10.1109/arith.2013.9
URL : http://www.eecs.berkeley.edu/~hdnguyen/public/papers/ARITH21_Fast_Sum.pdf

IEEE Computer Society, IEEE Standard for Floating-Point Arithmetic, IEEE Standard 754-2008, 2008.

N. J. Higham, Accuracy and stability of numerical algorithms, second ed, Society for Industrial and Applied Mathematics, 2002.

J.-M. Muller, N. Brisebarre, F. de Dinechin, C.-P. Jeannerod, V. Lefèvre et al., Handbook of Floating-Point Arithmetic, 2010.
URL : https://hal.archives-ouvertes.fr/ensl-00379167

D. E. Knuth, The Art of Computer Programming, Seminumerical Algorithms, vol.2, 1997.

X. S. Li, J. W. Demmel, D. H. Bailey, G. Henry, Y. Hida et al., Design, implementation and testing of extended and mixed precision BLAS, ACM Trans. Math. Softw, vol.28, issue.2, pp.152-205, 2002.

Y. Hida, X. S. Li, and D. H. Bailey, Algorithms for quad-double precision floating point arithmetic, Proceedings of the 15th IEEE Symposium on Computer Arithmetic, pp.155-162, 2001.

U. Kulisch and V. Snyder, The Exact Dot Product As Basic Tool for Long Interval Arithmetic, Computing, vol.91, issue.3, pp.307-313, 2011.

R. Shams and R. Kennedy, Efficient histogram algorithms for NVIDIA CUDA compatible devices, Proceedings of the International Conference on Signal Processing and Communications Systems (ICSPCS), pp.418-422, 2007.

G. Bohlender and U. Kulisch, Comments on fast and exact accumulation of products, Applied Parallel and Scientific Computing, vol.7134, pp.148-156, 2012.

D. Defour and F. de Dinechin, Software carry-save for fast multiple-precision algorithms, Proceedings of the 35th International Congress of Mathematical Software, pp.2002-2010, 2002.
URL : https://hal.archives-ouvertes.fr/hal-02102038

S. Boldo and G. Melquiond, Emulation of a FMA and Correctly Rounded Sums: Proved Algorithms Using Rounding to Odd, IEEE Transactions on Computers, vol.57, issue.4, pp.462-471, 2008.
URL : https://hal.archives-ouvertes.fr/inria-00080427

L. Fousse, G. Hanrot, V. Lefèvre, P. Pélissier, and P. Zimmermann, MPFR: A Multiple-precision Binary Floating-point Library with Correct Rounding, ACM Trans. Math. Softw, vol.33, issue.2, p.13, 2007.
URL : https://hal.archives-ouvertes.fr/inria-00070266

The MPFR development team, The GNU MPFR Library, available via the WWW, cited 2014.

J. Reinders, Intel Threading Building Blocks, 2007.

J. Demmel and H. D. Nguyen, Parallel Reproducible Summation, IEEE Transactions on Computers, vol.64, issue.7, pp.2060-2070, 2015.
DOI : 10.1109/tc.2014.2345391

A. Arteaga, O. Fuhrer, and T. Hoefler, Designing bit-reproducible portable high-performance applications, Proceedings of the 2014 IEEE 28th International Parallel and Distributed Processing Symposium (IPDPS '14), pp.1235-1244, 2014.

J. D. McCalpin, Is "ordered summation" a hard problem to speed up?, 2012.

J. R. Shewchuk, Robust adaptive floating-point geometric predicates, Proceedings of the twelfth annual symposium on Computational geometry, pp.141-150, 1996.

S. M. Rump, Ultimately fast accurate summation, SIAM J. Scientific Computing, vol.31, issue.5, pp.3466-3502, 2009.

Y. K. Zhu and W. B. Hayes, Algorithm 908: Online Exact Summation of Floating-Point Streams, ACM Trans. Math. Softw, vol.37, issue.3, pp.1-37, 2010.

J. Demmel and H. D. Nguyen, Numerical Reproducibility and Accuracy at ExaScale (invited talk), Proceedings of the 21st IEEE Symposium on Computer Arithmetic, pp.235-237, 2013.

R. M. Neal, Fast exact summation using small and large superaccumulators, 2015.

CR-Libm: a library of correctly rounded elementary functions in double-precision, 2007.

R. Iakymchuk, D. Defour, S. Collange, and S. Graillat, Reproducible triangular solvers for high-performance computing, 12th International Conference on Information Technology - New Generations (ITNG), pp.353-358, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01116588

R. Iakymchuk, D. Defour, S. Collange, and S. Graillat, Reproducible and Accurate Matrix Multiplication for GPU Accelerators, 2015.
URL : https://hal.archives-ouvertes.fr/hal-01102877

R. Iakymchuk, S. Collange, D. Defour, and S. Graillat, ExBLAS - Exact BLAS