Apache, Apache Spark programming guide

J. Dean and S. Ghemawat, MapReduce, Communications of the ACM, vol.51, issue.1, pp.107-113, 2008.
DOI : 10.1145/1327452.1327492

E. Deelman, G. Singh, M. H. Su, J. Blythe, Y. Gil et al., Pegasus: A Framework for Mapping Complex Scientific Workflows onto Distributed Systems, Scientific Programming, vol.13, issue.3, pp.219-237, 2005.
DOI : 10.1155/2005/128026

J. C. Jacob, D. S. Katz, G. B. Berriman, J. C. Good, A. Laity et al., Montage: a grid portal and software toolkit for science-grade astronomical image mosaicking, International Journal of Computational Science and Engineering, vol.4, issue.2, pp.73-87, 2009.
DOI : 10.1504/IJCSE.2009.026999

M. Liroz-Gistau, R. Akbarinia, E. Pacitti, F. Porto, and P. Valduriez, Dynamic Workload-Based Partitioning for Large-Scale Databases, Database and Expert Systems Applications, pp.978-981, 2012.
DOI : 10.1007/978-3-642-32597-7_16

URL : https://hal.archives-ouvertes.fr/lirmm-00748549

K. Ocaña and D. De-oliveira, Parallel computing in genomic research: advances and applications, Advances and Applications in Bioinformatics and Chemistry (AABC), vol.8, issue.23, 2015.

D. Oliveira, C. Boeres, F. Porto, and A. Fausti, Avaliação da localidade de dados intermediários na execução paralela de workflows bigdata, SBBD Proceedings, p.2015, 2015.

D. E. De-oliveira, C. Boeres, and F. Porto, Análise de estratégias de acesso a grandes volumes de dados, SBBD Proceedings, p.2014, 2014.

C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins, Pig Latin, Proceedings of the 2008 ACM SIGMOD international conference on Management of data, SIGMOD '08, 2008.
DOI : 10.1145/1376616.1376726

A. Thusoo, J. S. Sarma, N. Jain, Z. Shao, P. Chakka et al., Hive, Proc. VLDB Endow, pp.1626-1629, 2009.
DOI : 10.14778/1687553.1687609

M. Wilde, M. Hategan, J. M. Wozniak, B. Clifford, D. S. Katz et al., Swift: A language for distributed parallel scripting, Parallel Computing, vol.37, issue.9, pp.633-652, 2011.
DOI : 10.1016/j.parco.2011.05.005

M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma et al., Resilient Distributed Datasets, Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation, 2012.
DOI : 10.1145/2886107.2886110

J. Zhou, N. Bruno, M. C. Wu, P. A. Larson, R. Chaiken et al., SCOPE: parallel databases meet MapReduce, The VLDB Journal, vol.53, issue.1, pp.611-636, 2012.
DOI : 10.1145/1247480.1247540