Conference Papers Year : 2016

Extending CloudMdsQL with MFR for Big Data Integration

Carlyna Bondiombouy
Boyan Kolev
Patrick Valduriez
Oleksandra Levchenko

Abstract

In this short paper (see [2] for the long version), we propose a functional SQL-like query language (based on CloudMdsQL) and a query engine that retrieve data from two different kinds of data stores, an RDBMS and a distributed data processing framework such as Apache Spark or Hadoop MapReduce on top of HDFS, and combine the results by applying data integration operators (mostly joins). However, users need to be aware of how data are organized across the data stores so that they can write valid queries. The query therefore contains embedded invocations to the underlying data stores, expressed as subqueries. As our query language is functional, it introduces a tight coupling between data and functions.
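The integration pattern described in the abstract can be illustrated with a minimal sketch. It is not the CloudMdsQL syntax or engine, and it assumes that MFR refers to a map/filter/reduce style of subquery. An in-memory SQLite table stands in for the RDBMS, a plain Python list stands in for data held on HDFS, and every table, function, and data value below is hypothetical.

```python
import sqlite3
from collections import Counter

# Hypothetical relational store: an in-memory SQLite table (stand-in for the RDBMS).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE authors (name TEXT, affiliation TEXT)")
conn.executemany(
    "INSERT INTO authors VALUES (?, ?)",
    [("alice", "lab_a"), ("bob", "lab_b"), ("carol", "lab_a")],
)

# Hypothetical raw records (stand-in for text files stored on HDFS).
lines = [
    "alice database",
    "bob cloud",
    "alice cloud",
    "dave database",
    "alice database",
]


def relational_subquery(connection):
    """Subquery delegated to the relational store as native SQL."""
    sql = "SELECT name, affiliation FROM authors ORDER BY name"
    return connection.execute(sql).fetchall()


def mfr_subquery(records, keyword):
    """Subquery written as a map/filter/reduce chain (stand-in for a Spark job)."""
    pairs = map(lambda line: tuple(line.split()), records)  # map: line -> (author, word)
    matching = filter(lambda pair: pair[1] == keyword, pairs)  # filter: keep the keyword
    counts = Counter(author for author, _ in matching)  # reduce: count per author
    return list(counts.items())


def join(left, right):
    """Integration operator: equi-join of the two intermediate results on the name."""
    counts_by_name = dict(right)
    return [
        (name, affiliation, counts_by_name[name])
        for name, affiliation in left
        if name in counts_by_name
    ]


result = join(relational_subquery(conn), mfr_subquery(lines, "database"))
print(result)
```

Each helper plays the role of one embedded subquery, and the caller has to know which store holds which data, which mirrors the abstract's remark that users must be aware of how data are organized across the data stores.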
Main file
MFR-BDA2016.pdf (178.9 KB)
Origin : Files produced by the author(s)

Dates and versions

lirmm-01409104 , version 1 (05-12-2016)

Licence

Attribution - NonCommercial - NoDerivatives - CC BY 4.0

Identifiers

  • HAL Id : lirmm-01409104 , version 1

Cite

Carlyna Bondiombouy, Boyan Kolev, Patrick Valduriez, Oleksandra Levchenko. Extending CloudMdsQL with MFR for Big Data Integration. BDA: Gestion de Données — Principes, Technologies et Applications, LIAS / ISAE-ENSMA, Poitiers, Nov 2016, Poitiers, France. ⟨lirmm-01409104⟩
