Empirical Evaluation of Clustering Algorithms for Large Networks

Guillaume Artignan 1,*, Mountaz Hascoët 1
* Corresponding author
1 LIRMM/HE - Hors Équipe
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: Clustering is probably one of the most frequently used approaches to scaling problems in large networks. In many situations, however, choosing the most appropriate clustering algorithm can turn into a real dilemma. Numerical criteria have been proposed to evaluate the quality of clustering results, but so many different criteria exist that the dilemma only gets worse: most criteria reveal some aspects of result quality and hide others. The aim of this paper is to improve the understanding of clustering and to facilitate the comparison and choice of a clustering algorithm for a given purpose. Our proposal consists of studying both quality evaluation criteria and clustering algorithms. We start by discussing a selected set of representative criteria, then conduct a case study on a large set of real data, measuring not only the quality of different representative clustering algorithms but also the impact of each criterion on the ranking of the algorithms. By providing empirical results on several large-scale corpora of either inter-related documents or lexical networks, we hope to clarify the field and facilitate designers' choices.

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00648389
Contributor: Guillaume Artignan
Submitted on: Monday, December 5, 2011 - 3:42:12 PM
Last modification on: Wednesday, July 24, 2019 - 6:40:07 PM
Long-term archiving on: Tuesday, March 6, 2012 - 2:35:52 AM

File

2011_rr_ag_mh.pdf
Files produced by the author(s)

Identifiers

  • HAL Id: lirmm-00648389, version 1

Citation

Guillaume Artignan, Mountaz Hascoët. Empirical Evaluation of Clustering Algorithms for Large Networks. 2011. ⟨lirmm-00648389⟩
