Massively Distributed Clustering via Dirichlet Process Mixture
Conference Papers, Year: 2021

Abstract

Dirichlet Process Mixture (DPM) is a model used for multivariate clustering. Its main advantage, among other favorable characteristics, is that it discovers the number of clusters automatically, but its prohibitive response times make centralized DPM approaches inefficient. We propose a demonstration of two parallel clustering solutions: i) DC-DPM, which scales gracefully to millions of data points while remaining DPM compliant, which is the central challenge in distributing this process; and ii) HD4C, which addresses the curse of dimensionality by performing distributed DPM clustering of high-dimensional data such as time series or hyperspectral data.
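
To illustrate why a DPM does not need the number of clusters fixed in advance, the sketch below samples cluster labels from a Chinese restaurant process, a standard representation of the Dirichlet process prior. It is not the DC-DPM or HD4C algorithm described in the abstract, and it covers only the prior over partitions, not the likelihood or any distributed step. The function name `crp_assignments` and the concentration parameter name `alpha` are assumptions chosen for this example.

```python
import random


def crp_assignments(n_points, alpha, seed=0):
    """Sample cluster labels for n_points from a Chinese restaurant process.

    alpha (> 0) is the concentration parameter: larger values make it
    more likely that a point opens a new cluster.
    """
    rng = random.Random(seed)
    labels = []
    counts = []  # counts[k] = number of points currently in cluster k
    for _ in range(n_points):
        # An existing cluster k is chosen with weight counts[k];
        # a brand-new cluster is chosen with weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)  # open a new cluster
        else:
            counts[k] += 1
        labels.append(k)
    return labels, counts


if __name__ == "__main__":
    labels, counts = crp_assignments(n_points=1000, alpha=1.0, seed=1)
    print("number of clusters:", len(counts))
    print("largest cluster sizes:", sorted(counts, reverse=True)[:5])
```

The number of clusters is an outcome of the sampling rather than an input. A full DPM sampler would combine these prior weights with the likelihood of each point under each cluster.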
Main file
ECML_PKDD_2020.pdf (1.52 MB)
Origin: Files produced by the author(s)

Dates and versions

lirmm-03036910, version 1 (02-12-2020)

Cite

Khadidja Meguelati, Bénédicte Fontez, Nadine Hilgert, Florent Masseglia, Isabelle Sanchez. Massively Distributed Clustering via Dirichlet Process Mixture. ECML PKDD 2020 - European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, Sep 2020, Ghent / Virtual, Belgium. ⟨10.1007/978-3-030-67670-4_34⟩. ⟨lirmm-03036910⟩