Combinatorics of Distance-Based Tree Inference

Fabio Pardi 1, 2 Olivier Gascuel 1, 2, *
* Auteur correspondant
2 MAB - Méthodes et Algorithmes pour la Bioinformatique
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract : Several popular methods for phylogenetic inference (or hierarchical clustering) are based on a matrix of pairwise distances between taxa (or any kind of objects): the objective is to construct a tree with branch lengths so that the distances between the leaves in that tree are as close as possible to the input distances. If we hold the structure (topology) of the tree fixed, in some relevant cases (e.g. ordinary least squares) the optimal values for the branch lengths can be expressed using simple combinatorial formulae. Here we define a general form for these formulae and show that they all have two desirable properties: first, the common tree reconstruction approaches (least squares, minimum evolution), when used in combination with these formulae, are guaranteed to infer the correct tree when given enough data (consistency); second, the branch lengths of all the simple (nearest neighbor interchange) rearrangements of a tree can be calculated, optimally, in quadratic time in the size of the tree, thus allowing the efficient application of hill climbing heuristics. The study presented here is a continuation of that by Mihaescu and Pachter on branch length estimation (Mihaescu R, Pachter L (2008) Proc Natl Acad Sci USA 105:13206-13211). The focus here is on the inference of the tree itself, and on providing a basis for novel algorithms to reconstruct trees from distances.
Type de document :
Article dans une revue
Proceedings of the National Academy of Sciences of the United States of America , National Academy of Sciences, 2012, 109, pp.16443-16448
Liste complète des métadonnées

Littérature citée [45 références]  Voir  Masquer  Télécharger

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00726361
Contributeur : Fabio Pardi <>
Soumis le : mercredi 13 mars 2013 - 11:19:03
Dernière modification le : jeudi 24 mai 2018 - 15:59:22
Document(s) archivé(s) le : lundi 17 juin 2013 - 12:37:16

Fichier

PNAS_merged_HAL.pdf
Fichiers produits par l'(les) auteur(s)

Identifiants

  • HAL Id : lirmm-00726361, version 2

Collections

Citation

Fabio Pardi, Olivier Gascuel. Combinatorics of Distance-Based Tree Inference. Proceedings of the National Academy of Sciences of the United States of America , National Academy of Sciences, 2012, 109, pp.16443-16448. 〈lirmm-00726361v2〉

Partager

Métriques

Consultations de la notice

558

Téléchargements de fichiers

332