T. Allard, G. Hébrail, F. Masseglia, and E. Pacitti, Chiaroscuro: Transparency and Privacy for Massive Personal Time-Series Clustering. SIGMOD Conference, pp.779-794, 2015.

C. Bondiombouy and P. Valduriez, Query processing in multistore systems: an overview, International Journal of Cloud Computing, vol.5, issue.4, pp.309-346, 2016.
DOI : 10.1504/IJCC.2016.080903

URL : https://hal.archives-ouvertes.fr/hal-01289759

R. Bryant, Data-Intensive Scalable Computing for Scientific Applications, Computing in Science & Engineering, vol.13, issue.6, pp.25-33, 2011.
DOI : 10.1109/MCSE.2011.73

J. Camata, V. Silva, P. Valduriez, M. Mattoso, and A. Coutinho, In situ visualization and data analysis for turbidity currents simulation, Computers & Geosciences, vol.110, pp.23-31, 2018.
DOI : 10.1016/j.cageo.2017.09.013

URL : https://hal.archives-ouvertes.fr/lirmm-01620127

R. Campisano, F. Porto, E. Pacitti, F. Masseglia, and E. Ogasawara, Spatial Sequential Pattern Mining for Seismic Data. SBBD Conference, 2016.

R. Campisano, Sequence Mining in Spatial-Time Series (Master Degree Dissertation), CEFET/RJ, 2017.

A. Coutinho, Computational Science and Big Data: Where are We Now? XLDB Workshop, 2014.

T. Critchlow and K. Kleese van Dam, Data-Intensive Science, 2013.

A. B. Cruz, J. Ferreira, B. Monteiro, R. Coutinho, F. Porto et al., Detecção de Anomalias no Transporte Rodoviário Urbano, Brazilian Symposium on Databases (SBBD), 2017.

V. Dhar, Data science and prediction, Communications of the ACM, vol.56, issue.12, pp.64-73, 2013.
DOI : 10.1145/2500499

M. Ferro, A. R. Mury, and B. Schulze, A proposal to apply inductive logic programming to self-healing problem in grid computing: How will it work?, Concurrency and Computation: Practice and Experience, vol.2, issue.2, pp.2118-2135, 2011.

G. Fox, J. Qiu, S. Jha, S. Ekanayake, and S. Kamburugamuve, Big Data, Simulations and HPC Convergence, 2016.

D. Gaspar, F. Porto, R. Akbarinia, and E. Pacitti, TARDIS: Optimal Execution of Scientific Workflows in Apache Spark, Int. Conf. on Big Data Analytics and Knowledge Discovery, vol.21, issue.5, pp.74-87, 2017.

URL : https://hal.archives-ouvertes.fr/lirmm-01620060

T. Hey, The Fourth Paradigm: Data-Intensive Scientific Discovery, 2009.

URL : https://digital.library.unt.edu/ark:/67531/metadc31516/m2/1/high_res_d/4th_paradigm_book_complete_lr.pdf

X. Huang, T. Lu, X. Ding, and N. Gu, Enabling Data Recommendation in Scientific Workflow Based on Provenance, 2013 8th ChinaGrid Annual Conference, 2013.
DOI : 10.1109/ChinaGrid.2013.25

A. Khatibi, F. Porto, J. Rittmeyer, E. Ogasawara, P. Valduriez et al., Pre-processing and Indexing Techniques for Constellation Queries in Big Data. Int. Conf. on Big Data Analytics and Knowledge Discovery (DaWaK), pp.164-172, 2017.

M. Liroz-gistau, R. Akbarinia, E. Pacitti, F. Porto, and P. Valduriez, Dynamic Workload-based Partitioning Algorithms for Continuously Growing Databases. Trans. on Large-Scale Data- and Knowledge-Centered Systems, pp.105-128, 2013.

J. Liu, E. Pacitti, P. Valduriez, D. De-oliveira, and M. Mattoso, Multi-objective scheduling of Scientific Workflows in multisite clouds, Future Generation Computer Systems, vol.63, pp.76-95, 2016.
DOI : 10.1016/j.future.2016.04.014

URL : https://hal.archives-ouvertes.fr/lirmm-01342203

J. Liu, E. Pacitti, P. Valduriez, and M. Mattoso, Scientific Workflow Scheduling with Provenance Data in a Multisite Cloud, Trans. on Large-Scale Data- and Knowledge-Centered Systems (TLDKS), pp.80-112, 2017.

URL : https://hal.archives-ouvertes.fr/lirmm-01620224

J. Liu, E. N. Lemus, E. Pacitti, F. Porto, and P. Valduriez, Parallel Computation of PDFs on Big Spatial Data Using Spark, 2018.

H. Lustosa, F. Porto, N. Lemus, and P. Valduriez, TARS: An Extension of the Multi-dimensional Array Model, ER FORUM - Conceptual Modeling, 2017.

A. Matheus, H. Lustosa, F. Porto, and B. Schulze, Towards In-transit Analysis on Supercomputing Environments, 2018.

A. Mueen, Time series motif discovery: dimensions and applications, Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery, vol.4, issue.2, pp.152-159, 2014.

E. Ogasawara, L. C. Martinez, D. De-oliveira, G. Zimbrao, G. L. Pappa et al., Adaptive Normalization: A novel data normalization approach for non-stationary time series, The 2010 International Joint Conference on Neural Networks (IJCNN), pp.1-8, 2010.
DOI : 10.1109/IJCNN.2010.5596746

E. Ogasawara, J. Dias, D. Oliveira, F. Porto, P. Valduriez et al., An Algebraic Approach for Datacentric Scientific Workflows, Proceedings of the VLDB Endowment (PVLDB), pp.1328-1339, 2011.
URL : https://hal.archives-ouvertes.fr/hal-00640431

R. Oldfield, K. Moreland, N. Fabian, and D. Rogers, Evaluation of methods to integrate analysis into a large-scale shock physics code, Proceedings of the 28th ACM international conference on Supercomputing, ICS '14, pp.83-92, 2014.
DOI : 10.1145/2597652.2597668

T. Özsu and P. Valduriez, Principles of Distributed Database Systems, Third Edition, 2011.

E. Pacitti, R. Akbarinia, and M. El Dick, P2P Techniques for Decentralized Applications. Synthesis Lectures on Data Management, 2012.

L. Pineda-morales, J. Liu, A. Costan, E. Pacitti, G. Antoniu et al., Managing hot metadata for scientific workflows on multisite clouds, 2016 IEEE International Conference on Big Data (Big Data), pp.390-397, 2016.
DOI : 10.1109/BigData.2016.7840628

URL : https://hal.archives-ouvertes.fr/hal-01395715

F. Porto, J. Nobre, E. Ogasawara, P. Valduriez, and D. Shasha, Point Pattern Search in Big Data. Int. Conf. on Scientific and Statistical Database Management (SSDBM), 2018.

L. D. Raedt, Logical and Relational Learning: From ILP to MRDM (Cognitive Technologies), 2008.
DOI : 10.1007/978-3-540-68856-3

M. Servajean, R. Akbarinia, E. Pacitti, and S. Amer-yahia, Profile Diversity for Query Processing using User Recommendations, Information Systems, vol.48, pp.44-63, 2015.
DOI : 10.1016/j.is.2014.09.001

URL : https://hal.archives-ouvertes.fr/lirmm-01079523

V. Silva, J. Leite, J. Camata, D. De-oliveira, A. Coutinho et al., Raw data queries during data-intensive parallel workflow execution, Future Generation Computer Systems, vol.75, pp.402-422, 2017.
DOI : 10.1016/j.future.2017.01.016

URL : https://hal.archives-ouvertes.fr/lirmm-01445219

V. Silva, D. De-oliveira, P. Valduriez, and M. Mattoso, DfAnalyzer: Runtime Dataflow Analysis of Scientific Applications using Provenance, Proceedings of the VLDB Endowment, 2018.
URL : https://hal.archives-ouvertes.fr/lirmm-01867887

D. S. Jr, A. Paes, E. Pacitti, and D. Oliveira, Data Quality Prediction in Scientific Workflows, 2018.

R. Souza, V. Silva, P. Miranda, A. Lima, P. Valduriez et al., Spark Scalability Analysis in a Scientific Workflow, Best Paper Award, 2017.
URL : https://hal.archives-ouvertes.fr/lirmm-01620161

R. Souza, V. Silva, J. Camata, A. Coutinho, P. Valduriez et al., Tracking of Online Parameter Fine-tuning in Scientific Workflows, Workshop on Workflows in Support of Large-Scale Science (WORKS), ACM/IEEE Supercomputing Conference, 2017.
URL : https://hal.archives-ouvertes.fr/lirmm-01620974

P. Valduriez, Data-intensive HPC: opportunities and challenges. Big Data and Extreme-scale computing (BDEC), 2015.
URL : https://hal.archives-ouvertes.fr/lirmm-01184018