Journal articles

Parallel Query Processing in a Polystore

Pavlos Kranas (1, 2), Boyan Kolev (1), Oleksandra Levchenko (3), Esther Pacitti (3), Patrick Valduriez (3), Ricardo Jiménez-Peris (1), Marta Patiño-Martinez (2)
(3) ZENITH - Scientific Data Management, LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: The proliferation of different data stores has made polystores a major topic in the cloud and big data landscape. As the amount of data grows rapidly, it becomes critical to exploit the inherent parallel processing capabilities of the underlying data stores and data processing platforms. To fully achieve this, a polystore should: (i) preserve the expressivity of each data store's native query or scripting language and (ii) leverage a distributed architecture to enable parallel data integration, i.e., joins, on top of parallel retrieval of the underlying partitioned datasets. In this paper, we address these points by: (i) using the polyglot approach of the CloudMdsQL query language, which allows native queries to be expressed as inline scripts and combined with SQL statements for ad-hoc integration, and (ii) incorporating the approach within the LeanXcale distributed query engine, thus allowing native scripts to be processed in parallel at data store shards. In addition, (iii) efficient optimization techniques, such as bind join, can be applied to improve the performance of selective joins. Our experimental validation evaluates the performance benefits of exploiting parallelism in combination with high expressivity and optimization.
Document type: Journal articles

https://hal-lirmm.ccsd.cnrs.fr/lirmm-03148271
Contributor: Patrick Valduriez
Submitted on: Monday, February 22, 2021 - 10:07:31 AM
Last modification on: Wednesday, June 2, 2021 - 10:42:02 AM
Long-term archiving on: Sunday, May 23, 2021 - 6:17:18 PM

File

PMSQP_v4.4.pdf
Files produced by the author(s)

Citation

Pavlos Kranas, Boyan Kolev, Oleksandra Levchenko, Esther Pacitti, Patrick Valduriez, et al. Parallel Query Processing in a Polystore. Distributed and Parallel Databases, Springer, In press, pp. 39. ⟨10.1007/s10619-021-07322-5⟩. ⟨lirmm-03148271⟩
