Fast and accurate branch lengths estimation for phylogenomic trees

Manuel Binet 1, 2, 3; Olivier Gascuel 1, 2; Celine Scornavacca 1, 3; Emmanuel Douzery 3; Fabio Pardi 1, 2, *
* Corresponding author
2 MAB - Méthodes et Algorithmes pour la Bioinformatique
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: Background: Branch lengths are an important attribute of phylogenetic trees, providing essential information for many studies in evolutionary biology. Yet part of the current methodology for reconstructing a phylogeny from genomic information, namely supertree methods, focuses on the topology or structure of the phylogenetic tree rather than on the evolutionary divergences associated with it. Moreover, accurate methods for estimating branch lengths, typically based on probabilistic analysis of a concatenated alignment, are limited by large demands on memory and computing time, and may become impractical when the data sets are too large.
Results: Here, we present a novel phylogenomic distance-based method, named ERaBLE (Evolutionary Rates and Branch Length Estimation), to estimate the branch lengths of a given reference topology and the relative evolutionary rates of the genes employed in the analysis. ERaBLE takes as input a potentially very large collection of distance matrices, where each matrix is obtained from a different genomic region, either directly from its sequence alignment or indirectly from a gene tree inferred from the alignment. Our experiments show that ERaBLE is very fast and fairly accurate when compared to other possible approaches for the same tasks. Specifically, it efficiently and accurately deals with large data sets, such as the OrthoMaM v8 database, composed of 6,953 exons from up to 40 mammals.
Conclusions: ERaBLE may be used as a complement to supertree methods, or as an efficient alternative to maximum likelihood analysis of concatenated alignments, to estimate branch lengths from phylogenomic data sets.
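To make the idea of distance-based branch length estimation on a fixed topology concrete, here is a minimal Python sketch. It is not ERaBLE's algorithm: it fits a single distance matrix by ordinary least squares and does not estimate gene rates. The function name `fit_branch_lengths`, the representation of branches as leaf sets, and the four-taxon example are all assumptions made for illustration.

```python
from itertools import combinations

import numpy as np


def fit_branch_lengths(taxa, splits, dist):
    """Ordinary least-squares branch lengths for a fixed topology.

    taxa:   list of leaf names
    splits: one set of leaf names per branch (the leaves on one side of it)
    dist:   dict mapping frozenset({a, b}) to the observed distance
    """
    pairs = list(combinations(taxa, 2))
    # design[p, e] is 1 when branch e separates the two leaves of pair p,
    # i.e. when the branch lies on the path between them.
    design = np.array(
        [[1.0 if (a in side) != (b in side) else 0.0 for side in splits]
         for a, b in pairs]
    )
    observed = np.array([dist[frozenset(pair)] for pair in pairs])
    # Solve min ||design @ lengths - observed||^2.
    lengths, *_ = np.linalg.lstsq(design, observed, rcond=None)
    return lengths


# Hypothetical quartet ((A,B),(C,D)): four leaf branches plus one internal branch.
taxa = ["A", "B", "C", "D"]
splits = [{"A"}, {"B"}, {"C"}, {"D"}, {"A", "B"}]
dist = {
    frozenset("AB"): 3.0, frozenset("AC"): 9.0, frozenset("AD"): 10.0,
    frozenset("BC"): 10.0, frozenset("BD"): 11.0, frozenset("CD"): 7.0,
}
lengths = fit_branch_lengths(taxa, splits, dist)
print(dict(zip(["A", "B", "C", "D", "internal"], lengths.round(6))))
```

The example distances are exactly additive on the quartet, so the fit recovers the generating lengths (1, 2, 3, 4 for the leaf branches and 5 for the internal branch) up to floating-point error. With noisy distances the same call returns the least-squares compromise, which may include negative values since no positivity constraint is imposed.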
Document type: Journal article
BMC Bioinformatics, BioMed Central, 2016, 17 (23), 〈10.1186/s12859-015-0821-8〉

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01236485
Contributor: Fabio Pardi
Submitted on: Tuesday, December 1, 2015 - 17:31:21
Last modified on: Friday, October 19, 2018 - 14:22:03
Document(s) archived on: Friday, April 28, 2017 - 23:22:55

File

erable_for_hal.pdf
Files produced by the author(s)

Citation

Manuel Binet, Olivier Gascuel, Celine Scornavacca, Emmanuel Douzery, Fabio Pardi. Fast and accurate branch lengths estimation for phylogenomic trees. BMC Bioinformatics, BioMed Central, 2016, 17 (23), 〈10.1186/s12859-015-0821-8〉. 〈lirmm-01236485〉
