Visual Analysis of Clustering Algorithms: A Methodology and a Case Study

Guillaume Artignan; Mountaz Hascoët

Report, Year: 2011

Guillaume Artignan

Role: Author
PersonId: 862749

Outside team

Mountaz Hascoët

Role: Author
PersonId: 837916

Outside team

Abstract

Clustering is probably one of the most frequently used approaches to scaling problems in large document collections. In many situations, however, choosing the most appropriate clustering algorithm can turn into a real dilemma. Numerical criteria have been proposed to evaluate the quality of clustering results, but so many different criteria exist that the dilemma only worsens: most criteria reveal some aspects of result quality and hide others. The aim of this paper is to help readers understand clustering and to facilitate the comparison and choice of a clustering algorithm for a given purpose. Our proposal consists in studying both quality evaluation criteria and clustering algorithms. We start by discussing a selected set of representative criteria, and then conduct a case study on a large set of real data, measuring not only the quality of different representative clustering algorithms but also the impact of each criterion on the ranking of the algorithms. By providing empirical results on large-scale corpora of either documents or lexical networks useful to digital libraries, we hope to clarify the field and facilitate designers' choices.

Domains

Human-Computer Interaction [cs.HC]; Data Structures and Algorithms [cs.DS]; Discrete Mathematics [cs.DM]

Main file

icadl2011_submission_101.pdf (2.54 MB)

Origin: Files produced by the author(s)

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00585390

Submitted on: Wednesday, July 20, 2011, 09:41:16

Last modified on: Friday, March 24, 2023, 14:52:54

Long-term archiving on: Friday, October 21, 2011, 02:20:50

Dates and versions

lirmm-00585390, version 1 (12-04-2011)

lirmm-00585390, version 2 (20-07-2011)

Identifiers

HAL Id: lirmm-00585390, version 2

Cite

Guillaume Artignan, Mountaz Hascoët. Visual Analysis of Clustering Algorithms: A Methodology and a Case Study. RR-11015, 2011. ⟨lirmm-00585390v2⟩

Collections

CNRS LIRMM HORSEQUIPE TDS-MACS LARA MIPS UNIV-MONTPELLIER

146 views

169 downloads
