Level 1 Parallel RTN-BLAS: Implementation and Efficiency Analysis, SCAN'2014. Wurzburg, 2014. ,
URL : https://hal.archives-ouvertes.fr/lirmm-01095172
Reproducible and Accurate Matrix Multiplication in ExBLAS for High-Performance Computing, 2014. ,
A floating-point technique for extending the available precision, Numerische Mathematik, vol.5, issue.3, pp.224-242, 1971. ,
DOI : 10.1007/BF01397083
Fast Reproducible Floating-Point Summation, 2013 IEEE 21st Symposium on Computer Arithmetic, 2013. ,
DOI : 10.1109/ARITH.2013.9
First steps towards more numerical reproducibility, ESAIM: Proceedings and Surveys, vol.45, pp.229-238, 2013. ,
DOI : 10.1051/proc/201445023
Handbook of Floating-Point Arithmetic, 2010. ,
DOI : 10.1007/978-0-8176-4705-6
URL : https://hal.archives-ouvertes.fr/ensl-00379167
Accurate Sum and Dot Product, SIAM Journal on Scientific Computing, vol.26, issue.6, pp.1955-1988, 2005. ,
DOI : 10.1137/030601818
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.2.1547
Intel Threading Building Blocks, 2007. ,
Ultimately Fast Accurate Summation, SIAM Journal on Scientific Computing, vol.31, issue.5, pp.3466-3502, 2009. ,
DOI : 10.1137/080738490
Accurate Floating-Point Summation Part I: Faithful Rounding, SIAM Journal on Scientific Computing, vol.31, issue.1, pp.189-224, 2008. ,
DOI : 10.1137/050645671
URL : http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.519.3738
Numerical reproducibility in the Intel Math Kernel Library, 2012. ,
BLIS: A Framework for Rapidly Instanciating BLAS Functionality, ACM Trans. Math. Software, 2015. ,
A parallel algorithm for accurate dot product, Parallel Computing, vol.34, issue.6-8, pp.6-8, 2008. ,
DOI : 10.1016/j.parco.2008.02.002
Correct Rounding and a Hybrid Approach to Exact Floating-Point Summation, SIAM Journal on Scientific Computing, vol.31, issue.4, pp.2981-3001, 2009. ,
DOI : 10.1137/070710020
Algorithm 908, ACM Transactions on Mathematical Software, vol.37, issue.3, pp.1-3713, 2010. ,
DOI : 10.1145/1824801.1824815