Evaluation of Clustering Algorithms: a methodology and a case study
Abstract
Clustering is often cited as one of the most efficient ways to address the challenge of scaling. Thousands of clustering approaches have been proposed over the past decades. As a result, the problem of designing an appropriate clustering algorithm has gradually been replaced by the problem of choosing one implementation of a given algorithm from among a large number of candidates. Because of the complexity of the field, however, choosing the appropriate implementation can quickly turn into a dilemma. This paper introduces a methodology for the evaluation of clustering algorithms based on (1) complementary theoretical quality measures expressed in a unified notation system, (2) empirical studies on original datasets, and (3) new technological instruments for both running experiments and visually analyzing the results. Such a methodology is important not only to facilitate the choice of a clustering algorithm but also to consolidate the validity of the results by enabling the reproducibility and comparison of experiments. By proposing a methodology together with a case study, we aim to bring new insights into the evaluation and comparison of clustering approaches, which we hope will help clarify the field.
Origin | Files produced by the author(s) |
---|