Spark-parSketch: A Massively Distributed Indexing of Time Series Datasets

Oleksandra Levchenko 1, Djamel-Edine Yagoubi 1, Reza Akbarinia 1, Florent Masseglia 1, Boyan Kolev 1, Dennis Shasha 2
1 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: A growing number of domains (finance, seismology, internet-of-things, etc.) collect massive time series. When the number of series grows to the hundreds of millions or even billions, similarity queries become intractable on a single machine. Moreover, naive (quadratic) parallelization does not scale well, so both efficient indexing and parallelization are needed. We propose a demonstration of Spark-parSketch, a complete solution based on sketches / random projections that efficiently performs both the parallel indexing of large sets of time series and similarity search on them. Because our method is approximate, we explore the tradeoff between time and precision. A video showing the dynamics of the demonstration is available at http://parsketch.gforge.inria.fr/video/parSketchdemo_720p.mov.
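As an illustration of the sketch / random projection idea mentioned in the abstract, here is a minimal single-machine sketch in Python. It is not the Spark-parSketch implementation: the function names (make_projection, sketch, euclidean, nearest), the choice of random +1/-1 vectors, and the linear scan over sketches are all assumptions made for this example. It only shows how each series can be reduced to a short vector of dot products with random vectors, so that distances between sketches approximate distances between the original series.

```python
import math
import random


def make_projection(length, sketch_size, seed=0):
    """Return sketch_size random +1/-1 vectors, each of the given length."""
    rng = random.Random(seed)
    return [[rng.choice((-1.0, 1.0)) for _ in range(length)]
            for _ in range(sketch_size)]


def sketch(series, projection):
    """Dot the series with every random vector.

    The 1/sqrt(k) scaling makes the squared sketch distance an unbiased
    estimate of the squared Euclidean distance between the series.
    """
    scale = 1.0 / math.sqrt(len(projection))
    return [scale * sum(x * r for x, r in zip(series, row))
            for row in projection]


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query_sketch, sketches):
    """Index of the sketch closest to query_sketch (plain linear scan)."""
    return min(range(len(sketches)),
               key=lambda i: euclidean(query_sketch, sketches[i]))


# Hypothetical toy data: 100 random series of length 256, sketched to 32 values.
data_rng = random.Random(42)
data = [[data_rng.gauss(0.0, 1.0) for _ in range(256)] for _ in range(100)]
projection = make_projection(length=256, sketch_size=32, seed=7)
sketches = [sketch(s, projection) for s in data]

# Approximate search: compare the query's sketch against the stored sketches.
query = data[3]
candidate = nearest(sketch(query, projection), sketches)
```

The search compares vectors of 32 values instead of 256, which is where the time savings come from; the price is that the returned candidate is only approximately the nearest series, which is the time versus precision tradeoff the abstract refers to. A distributed system would additionally need to partition the sketches across workers, which this example does not attempt.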
Document type:
Conference paper
CIKM: Conference on Information and Knowledge Management, Oct 2018, Turin, Italy. 27th ACM International Conference on Information and Knowledge Management, pp.1951-1954, 2018, 〈10.1145/3269206.3269226〉
https://hal-lirmm.ccsd.cnrs.fr/lirmm-01886760
Contributor: Reza Akbarinia
Submitted on: Wednesday, October 3, 2018 - 10:58:52
Last modified on: Monday, November 26, 2018 - 21:28:13

File

CIKM2018.pdf
Files produced by the author(s)

Citation

Oleksandra Levchenko, Djamel-Edine Yagoubi, Reza Akbarinia, Florent Masseglia, Boyan Kolev, et al. Spark-parSketch: A Massively Distributed Indexing of Time Series Datasets. CIKM: Conference on Information and Knowledge Management, Oct 2018, Turin, Italy. 27th ACM International Conference on Information and Knowledge Management, pp.1951-1954, 2018, 〈10.1145/3269206.3269226〉. 〈lirmm-01886760〉
