Using Repeated Measurements to Validate Hierarchical Gene Clusters

Laurent Brehelin 1, *, Olivier Gascuel 1, Olivier Martin 2
* Corresponding author
1 MAB - Méthodes et Algorithmes pour la Bioinformatique
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract:
Motivation: Hierarchical clustering is a common approach to studying protein and gene expression data. This unsupervised technique is used to find clusters of genes or proteins that are expressed in a coordinated manner across a set of conditions. Because of both biological and technical variability, experiments are generally repeated. In this work, we propose an approach that evaluates the stability of clusters derived from hierarchical clustering by taking repeated measurements into account.
Results: The method is based on the bootstrap technique, which is used to obtain pseudo-hierarchies of genes from resampled datasets. Using a fast dynamic programming algorithm, we compare the original hierarchy to the pseudo-hierarchies and assess the stability of the original gene clusters. A shuffling procedure can then be used to assess the significance of the cluster stabilities. Our approach is illustrated on simulated data and on two microarray datasets. Compared to the standard hierarchical clustering methodology, it distinguishes dubious clusters from stable ones, and thus avoids misleading interpretations.
Availability: The programs were developed in C and R. Supplementary material and source code are available at http://www.lirmm.fr/~brehelin/Stability/
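The bootstrap idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' C/R implementation, and it rests on several assumptions made here for illustration only: average-linkage clustering on Euclidean distances, resampling of replicates with replacement within each condition, and stability measured as the fraction of pseudo-hierarchies that contain exactly the same cluster. That exact-match score is a simpler stand-in for the dynamic programming comparison the abstract mentions, and the shuffling step for significance is not covered. All function names and the toy data are hypothetical.

```python
import random
from itertools import combinations

def mean_profiles(data, picks):
    """Average the picked replicates per condition for each gene.
    data[g][c] is a list of replicate values; picks[c] lists replicate indices."""
    return [[sum(gene[c][r] for r in picks[c]) / len(picks[c])
             for c in range(len(gene))] for gene in data]

def euclid(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def hierarchy_clusters(profiles):
    """Average-linkage agglomerative clustering (assumed linkage).
    Returns the set of internal clusters as frozensets of gene indices."""
    active = [frozenset([i]) for i in range(len(profiles))]
    found = set()

    def linkage(pair):
        a, b = pair
        total = sum(euclid(profiles[i], profiles[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(active) > 1:
        a, b = min(combinations(active, 2), key=linkage)
        merged = a | b
        active = [c for c in active if c not in (a, b)] + [merged]
        found.add(merged)
    return found

def cluster_stability(data, n_boot=200, seed=0):
    """Fraction of bootstrap pseudo-hierarchies that reproduce each original cluster."""
    rng = random.Random(seed)
    n_cond, n_rep = len(data[0]), len(data[0][0])
    all_reps = [list(range(n_rep))] * n_cond
    original = hierarchy_clusters(mean_profiles(data, all_reps))
    counts = dict.fromkeys(original, 0)
    for _ in range(n_boot):
        # Resample replicate indices with replacement, per condition.
        picks = [[rng.randrange(n_rep) for _ in range(n_rep)]
                 for _ in range(n_cond)]
        pseudo = hierarchy_clusters(mean_profiles(data, picks))
        for cluster in original:
            if cluster in pseudo:
                counts[cluster] += 1
    return {cluster: k / n_boot for cluster, k in counts.items()}

# Hypothetical toy data: 4 genes x 3 conditions x 3 replicates.
toy = [
    [[0.0, 0.1, -0.1], [1.0, 1.1, 0.9], [0.0, 0.2, -0.2]],
    [[0.1, 0.0, 0.2], [1.1, 0.9, 1.0], [0.1, -0.1, 0.0]],
    [[5.0, 5.1, 4.9], [0.0, 0.1, -0.1], [5.0, 5.2, 4.8]],
    [[5.1, 4.9, 5.0], [0.1, -0.1, 0.0], [4.9, 5.1, 5.0]],
]

if __name__ == "__main__":
    for cluster, score in cluster_stability(toy, n_boot=50, seed=1).items():
        print(sorted(cluster), score)
```

A cluster whose score stays close to 1 is reproduced in nearly every resampled dataset, whereas a low score marks a grouping that depends on which replicates happened to be drawn.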
Document type: Journal article
Bioinformatics, Oxford University Press (OUP), 2008, 24, pp.682-688

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00272116
Contributor: Laurent Brehelin
Submitted on: Friday, 11 April 2008 - 08:11:43
Last modified on: Thursday, 24 May 2018 - 15:59:22
Document(s) archived on: Friday, 21 May 2010 - 01:38:24

Identifiers

  • HAL Id : lirmm-00272116, version 1

Citation

Laurent Brehelin, Olivier Gascuel, Olivier Martin. Using Repeated Measurements to Validate Hierarchical Gene Clusters. Bioinformatics, Oxford University Press (OUP), 2008, 24, pp.682-688. 〈lirmm-00272116〉

Metrics

Record views: 214

File downloads: 98