Massively Distributed Time Series Indexing and Querying

Djamel-Edine Yagoubi 1, Reza Akbarinia 1, Florent Masseglia 1, Themis Palpanas 2
1 ZENITH - Scientific Data Management
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier, CRISAM - Inria Sophia Antipolis - Méditerranée
Abstract: Indexing is crucial for many data mining tasks that rely on efficient and effective similarity query processing. Consequently, indexing large volumes of time series and high-performance similarity query processing have become topics of high interest. For many applications across diverse domains, though, the amount of data to be processed might be intractable for a single machine, making existing centralized indexing solutions inefficient. We propose a parallel indexing solution that gracefully scales to billions of time series (or high-dimensional vectors, in general), and a parallel query processing strategy that, given a batch of queries, efficiently exploits the index. Our experiments, on both synthetic and real-world data, show that our index creation algorithm processes 4 billion time series in less than 5 hours, while state-of-the-art centralized algorithms do not scale: they reach their limit at 1 billion time series, for which they need more than 5 days. Also, our distributed querying algorithm efficiently processes millions of queries over collections of billions of time series, thanks to an effective load balancing mechanism.
Document type: Journal articles

Cited literature: 38 references

https://hal-lirmm.ccsd.cnrs.fr/lirmm-02197618
Contributor: Reza Akbarinia
Submitted on: Tuesday, July 30, 2019 - 2:58:36 PM
Last modification on: Thursday, August 15, 2019 - 1:09:14 AM

File

DPiSAX_TKDE.pdf
Files produced by the author(s)

Citation

Djamel-Edine Yagoubi, Reza Akbarinia, Florent Masseglia, Themis Palpanas. Massively Distributed Time Series Indexing and Querying. IEEE Transactions on Knowledge and Data Engineering, Institute of Electrical and Electronics Engineers, 2019, pp.1-14. ⟨10.1109/TKDE.2018.2880215⟩. ⟨lirmm-02197618⟩
