Spark-parSketch: A Massively Distributed Indexing of Time Series Datasets - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Conference Papers, Year: 2018

Spark-parSketch: A Massively Distributed Indexing of Time Series Datasets

Oleksandra Levchenko
Djamel-Edine Yagoubi
Reza Akbarinia
Florent Masseglia
Boyan Kolev

Abstract

A growing number of domains (finance, seismology, internet-of-things, etc.) collect massive time series. When the number of series grows to hundreds of millions or even billions, similarity queries become intractable on a single machine. Moreover, naive (quadratic) parallelization scales poorly, so both efficient indexing and parallelization are needed. We propose a demonstration of Spark-parSketch, a complete solution based on sketches (random projections) that efficiently performs both parallel indexing of large sets of time series and similarity search over them. Because our method is approximate, we explore the tradeoff between time and precision. A video showing the dynamics of the demonstration is available at http://parsketch.gforge.inria.fr/video/parSketchdemo_720p.mov.
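
The abstract does not detail how the sketches are built. As an illustration only, the following is a minimal single-machine sketch of the general random-projection idea, assuming random vectors with entries in {-1, +1} and Euclidean distance as the similarity measure. The function names (`make_projection`, `sketch`, `sketch_distance`) and all parameter values are hypothetical and are not taken from Spark-parSketch.

```python
import math
import random


def make_projection(series_len, sketch_len, seed=0):
    """Build sketch_len random vectors with entries drawn from {-1, +1}."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(series_len)]
            for _ in range(sketch_len)]


def sketch(series, projection):
    """Project one series onto every random vector (one dot product each)."""
    return [sum(x * r for x, r in zip(series, row)) for row in projection]


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sketch_distance(sa, sb):
    """Estimate the original distance from two sketches.

    Each squared sketch coordinate has expectation ||a - b||^2, so the
    sketch distance is rescaled by sqrt(sketch length).
    """
    return euclidean(sa, sb) / math.sqrt(len(sa))


if __name__ == "__main__":
    rng = random.Random(1)
    a = [rng.gauss(0, 1) for _ in range(128)]
    b = [rng.gauss(0, 1) for _ in range(128)]
    proj = make_projection(series_len=128, sketch_len=256, seed=42)
    print("exact distance:    ", euclidean(a, b))
    print("estimated distance:", sketch_distance(sketch(a, proj), sketch(b, proj)))
```

The estimate is approximate by construction: a longer sketch gives a tighter estimate at a higher cost per series, which is one simple instance of the time/precision tradeoff the abstract mentions. In practice the sketch would be chosen shorter than the series so that comparisons are cheaper. The length of 256 above is only meant to make the estimate visibly close.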
Main file
CIKM2018.pdf (1.26 MB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-01886760, version 1 (03-10-2018)

Identifiers

HAL Id: lirmm-01886760
DOI: 10.1145/3269206.3269226

Cite

Oleksandra Levchenko, Djamel-Edine Yagoubi, Reza Akbarinia, Florent Masseglia, Boyan Kolev, et al. Spark-parSketch: A Massively Distributed Indexing of Time Series Datasets. CIKM 2018 - 27th ACM International Conference on Information and Knowledge Management, Oct 2018, Turin, Italy. pp. 1951-1954, ⟨10.1145/3269206.3269226⟩. ⟨lirmm-01886760⟩