J. Dean, S. Ghemawat, and M. , Simplified data processing on large clusters, 6th Symposium on Operating System Design and Implementation (OSDI), 2004.

M. Zaharia, M. Chowdhury, T. Das, A. Dave, J. Ma et al., Resilient Distributed Datasets, USENIX Symposium on Networked Systems Design and Implementation (NSDI), 2012.
DOI : 10.1145/2886107.2886110

Y. Kwon, M. Balazinska, B. Howe, and J. A. Rolia, SkewTune, Proceedings of the 2012 international conference on Management of Data, SIGMOD '12, p.2012
DOI : 10.1145/2213836.2213840

R. Akbarinia, M. Liroz-gistau, D. Agrawal, and P. Valduriez, An Efficient Solution for Processing Skewed MapReduce Jobs, Database and Expert Systems Applications (DEXA), 2015.
DOI : 10.1007/978-3-319-22852-5_35

URL : https://hal.archives-ouvertes.fr/lirmm-01162359

M. Liroz-gistau, R. Akbarinia, and P. Valduriez, FP-Hadoop, Proceedings of the VLDB Endowment, vol.8, issue.12, pp.1856-1859, 2015.
DOI : 10.14778/2824032.2824085

URL : https://hal.archives-ouvertes.fr/lirmm-01377715

T. White, Hadoop -The Definitive Guide: Storage and Analysis at Internet Scale, 2012.

J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart et al., Data cube: a relational aggregation operator generalizing GROUP-BY, CROSS-TAB, and SUB-TOTALS, Proceedings of the Twelfth International Conference on Data Engineering, pp.29-53, 1997.
DOI : 10.1109/ICDE.1996.492099

S. Brin and L. Page, The anatomy of a large-scale hypertextual Web search engine, Computer Networks and ISDN Systems, vol.30, issue.1-7, pp.107-117, 1998.
DOI : 10.1016/S0169-7552(98)00110-X

K. Lee, Y. Lee, H. Choi, Y. D. Chung, and B. Moon, Parallel data processing with MapReduce, ACM SIGMOD Record, vol.40, issue.4, pp.11-20, 2011.
DOI : 10.1145/2094114.2094118

C. Olston, B. Reed, U. Srivastava, R. Kumar, and A. Tomkins, Pig latin, Proceedings of the 2008 ACM SIGMOD international conference on Management of data , SIGMOD '08, 2008.
DOI : 10.1145/1376616.1376726

A. Floratou, J. M. Patel, E. J. Shekita, and S. Tata, Column-oriented storage techniques for MapReduce, Proceedings of the VLDB Endowment, vol.4, issue.7, pp.419-429, 2011.
DOI : 10.14778/1988776.1988778

Y. Lin, D. Agrawal, C. Chen, B. C. Ooi, and S. Wu, Llama, Proceedings of the 2011 international conference on Management of data, SIGMOD '11, 2011.
DOI : 10.1145/1989323.1989424

F. Chang, J. Dean, S. Ghemawat, W. C. Hsieh, D. A. Wallach et al., Bigtable, ACM Transactions on Computer Systems, vol.26, issue.2, pp.4-26, 2008.
DOI : 10.1145/1365815.1365816

Y. Bu, B. Howe, M. Balazinska, and M. D. Ernst, The HaLoop approach to large-scale iterative data analysis, The VLDB Journal, vol.7, issue.1
DOI : 10.1007/s00778-012-0269-7

J. Dittrich, J. Quiané-ruiz, A. Jindal, Y. Kargin, V. Setty et al., Hadoop++, Proceedings of the VLDB Endowment, vol.3, issue.1-2, pp.518-529, 2010.
DOI : 10.14778/1920841.1920908

I. Elghandour and A. Aboulnaga, ReStore, Proceedings of the 2012 international conference on Management of Data, SIGMOD '12, 2012.
DOI : 10.1145/2213836.2213937

J. Huang, R. Zhang, R. Buyya, and J. Chen, MELODY-JOIN: Efficient Earth Mover's Distance similarity joins using MapReduce, 2014 IEEE 30th International Conference on Data Engineering, 2014.
DOI : 10.1109/ICDE.2014.6816702

H. Yang, A. Dasdan, R. Hsiao, and D. S. Parker, Mapreduce-merge: simplified relational data processing on large clusters, 2007.

D. Jiang, A. K. Tung, and G. Chen, MAP-JOIN-REDUCE: Toward Scalable and Efficient Data Analysis on Large Clusters, IEEE Transactions on Knowledge and Data Engineering, vol.23, issue.9, pp.1299-1311, 2011.
DOI : 10.1109/TKDE.2010.248

Y. N. Silva and J. M. Reed, Exploiting MapReduce-based similarity joins, Proceedings of the 2012 international conference on Management of Data, SIGMOD '12, 2012.
DOI : 10.1145/2213836.2213935

A. Okcan and M. Riedewald, Processing theta-joins using MapReduce, Proceedings of the 2011 international conference on Management of data, SIGMOD '11, 2011.
DOI : 10.1145/1989323.1989423

D. Deng, G. Li, S. Hao, J. Wang, and J. Feng, MassJoin: A mapreduce-based method for scalable string similarity joins, 2014 IEEE 30th International Conference on Data Engineering, 2014.
DOI : 10.1109/ICDE.2014.6816663

S. Fries, B. Boden, G. Stepien, and T. Seidl, PHiDJ: Parallel similarity self-join for high-dimensional vector data with MapReduce, 2014 IEEE 30th International Conference on Data Engineering, 2014.
DOI : 10.1109/ICDE.2014.6816701

B. Gufler, N. Augsten, A. Reiser, and A. Kemper, Load Balancing in MapReduce Based on Scalable Cardinality Estimates, 2012 IEEE 28th International Conference on Data Engineering, 2012.
DOI : 10.1109/ICDE.2012.58

S. R. Ramakrishnan, G. Swart, and A. Urmanov, Balancing reducer skew in MapReduce workloads using progressive sampling, Proceedings of the Third ACM Symposium on Cloud Computing, SoCC '12, 2012.
DOI : 10.1145/2391229.2391245

S. Rao, R. Ramakrishnan, A. Silberstein, M. Ovsiannikov, and D. Reeves, Sailfish, Proceedings of the Third ACM Symposium on Cloud Computing, SoCC '12, 2012.
DOI : 10.1145/2391229.2391233

A. Shinnar, D. Cunningham, B. Herta, and V. A. Saraswat, M3R, Proceedings of the VLDB Endowment, vol.5, issue.12, pp.1736-1747, 2012.
DOI : 10.14778/2367502.2367513

K. Elmeleegy, C. Olston, and B. Reed, SpongeFiles, Proceedings of the 2014 ACM SIGMOD international conference on Management of data, SIGMOD '14, 2014.
DOI : 10.1145/2588555.2595634