Extending CloudMdsQL with MFR for Big Data Integration

Carlyna Bondiombouy 1 Boyan Kolev 1 Patrick Valduriez 1 Oleksandra Levchenko 1
1 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract : In this short paper (see [2] for the long version), we propose a functional SQL-like query language (based on CloudMdsQL) and a query engine to retrieve data from two different kinds of data stores, an RDBMS and a distributed data processing framework such as Apache Spark or Hadoop MapReduce on top of HDFS, and to combine them by applying data integration operators (mostly joins). Users need to be aware of how data are organized across the data stores so that they can write valid queries. The query therefore contains embedded invocations to the underlying data stores, expressed as subqueries. As our query language is functional, it introduces a tight coupling between data and functions.
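The integration pattern described in the abstract (one subquery per data store, combined by a join) can be sketched roughly as follows. This is only an illustrative Python sketch, not CloudMdsQL syntax or the authors' engine: an in-memory SQLite database stands in for the RDBMS, a plain Python pipeline stands in for the Spark or MapReduce side, and MFR is read here as map/filter/reduce, which is an assumption since the abstract does not expand the acronym. All table names, field layouts, and function names are hypothetical.

```python
import sqlite3
from collections import Counter

# Hypothetical raw text records standing in for files on HDFS: "customer_id,event".
LOG_LINES = ["1,purchase", "2,view", "1,purchase", "3,purchase", "2,view"]


def relational_subquery():
    """Stand-in for the subquery sent to the RDBMS; returns (id, name) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "alice"), (2, "bob"), (3, "carol")],
    )
    rows = conn.execute("SELECT id, name FROM customers ORDER BY id").fetchall()
    conn.close()
    return rows


def mfr_subquery(lines):
    """Stand-in for the map/filter/reduce subquery; returns (id, purchase_count)."""
    mapped = (line.split(",") for line in lines)                    # map: parse each record
    filtered = (fields for fields in mapped if fields[1] == "purchase")  # filter: keep purchases
    counts = Counter(int(fields[0]) for fields in filtered)         # reduce: count per customer
    return sorted(counts.items())


def hash_join(left, right):
    """Inner equi-join of two lists of (key, value) pairs on the key."""
    index = {}
    for key, value in right:
        index.setdefault(key, []).append(value)
    return [(key, lval, rval) for key, lval in left for rval in index.get(key, [])]


def integrate():
    """Run both subqueries and combine their results with a join."""
    return hash_join(relational_subquery(), mfr_subquery(LOG_LINES))


if __name__ == "__main__":
    print(integrate())
```

In this sketch the caller has to know which store holds which data (customer names in the relational table, events in the text records), which mirrors the abstract's remark that users must be aware of how data are organized across the data stores.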
Document type :
Conference papers

Cited literature [5 references]

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01409104
Contributor : Patrick Valduriez
Submitted on : Monday, December 5, 2016 - 4:34:19 PM
Last modification on : Monday, May 4, 2020 - 11:40:20 AM
Long-term archiving on : Monday, March 20, 2017 - 8:33:54 PM

File

MFR-BDA2016.pdf
Files produced by the author(s)

Licence


Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License

Identifiers

  • HAL Id : lirmm-01409104, version 1

Citation

Carlyna Bondiombouy, Boyan Kolev, Patrick Valduriez, Oleksandra Levchenko. Extending CloudMdsQL with MFR for Big Data Integration. BDA: Gestion de Données — Principes, Technologies et Applications, LIAS / ISAE-ENSMA, Poitiers, Nov 2016, Poitiers, France. ⟨lirmm-01409104⟩
