
Extending CloudMdsQL with MFR for Big Data Integration

Carlyna Bondiombouy 1 Boyan Kolev 1 Patrick Valduriez 1 Oleksandra Levchenko 1
1 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : In this short paper (see [2] for the long version), we propose a functional SQL-like query language (based on CloudMdsQL) and a query engine to retrieve data from two different kinds of data stores, an RDBMS and a distributed data processing framework such as Apache Spark or Hadoop MapReduce on top of HDFS, and to combine them by applying data integration operators (mostly joins). Users must know how data are organized across the data stores in order to write valid queries. A query therefore contains embedded invocations to the underlying data stores, expressed as subqueries. Because the query language is functional, it introduces a tight coupling between data and functions.
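The integration pattern described in the abstract can be illustrated with a minimal Python sketch. It is not CloudMdsQL syntax and not the authors' engine. It assumes that MFR denotes a chain of map/filter/reduce operations, uses an in-memory SQLite database as a stand-in for the RDBMS, and uses a plain Python list of text lines as a stand-in for data held in a framework such as Spark. All table names, function names, and sample data are hypothetical.

```python
import sqlite3
from collections import Counter

# Hypothetical raw text, standing in for a file processed by Spark or MapReduce.
LOG_LINES = ["alice bob alice", "carol alice", "bob dave"]


def relational_subquery():
    """Stand-in for the subquery sent to the RDBMS (here: in-memory SQLite)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE authors (name TEXT, affiliation TEXT)")
    conn.executemany(
        "INSERT INTO authors VALUES (?, ?)",
        [("alice", "LabA"), ("bob", "LabB"), ("carol", "LabA")],
    )
    rows = conn.execute("SELECT name, affiliation FROM authors").fetchall()
    conn.close()
    return rows


def mfr_subquery(lines, min_count):
    """Stand-in for a map/filter/reduce chain over raw text lines."""
    # map: emit one (word, 1) pair per word
    pairs = ((word, 1) for line in lines for word in line.split())
    # reduce: sum the counts per word
    counts = Counter()
    for word, one in pairs:
        counts[word] += one
    # filter: keep only words seen at least min_count times
    return [(word, n) for word, n in counts.items() if n >= min_count]


def hash_join(left, right):
    """Equi-join two lists of (key, value) pairs on their first field."""
    index = {}
    for key, value in right:
        index.setdefault(key, []).append(value)
    return [(key, lval, rval) for key, lval in left for rval in index.get(key, [])]


def integrate():
    """Run both subqueries, then combine their results with a join."""
    return sorted(hash_join(relational_subquery(), mfr_subquery(LOG_LINES, 2)))


if __name__ == "__main__":
    for row in integrate():
        print(row)
```

Each subquery is an ordinary function that returns a list of tuples, so the join operator does not need to know which data store produced its inputs. The caller still has to know which data lives in which store to write the two subqueries, which mirrors the constraint stated in the abstract.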
Document type :
Conference papers

Contributor : Patrick Valduriez
Submitted on : Monday, December 5, 2016 - 4:34:19 PM
Last modification on : Wednesday, June 2, 2021 - 10:42:02 AM
Long-term archiving on : Monday, March 20, 2017 - 8:33:54 PM


Files produced by the author(s)


Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License


  • HAL Id : lirmm-01409104, version 1


Carlyna Bondiombouy, Boyan Kolev, Patrick Valduriez, Oleksandra Levchenko. Extending CloudMdsQL with MFR for Big Data Integration. BDA: Gestion de Données — Principes, Technologies et Applications, LIAS / ISAE-ENSMA, Poitiers, Nov 2016, Poitiers, France. ⟨lirmm-01409104⟩


