New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0

PhyML is a phylogeny software package based on the maximum-likelihood principle. Early PhyML versions used a fast algorithm performing Nearest Neighbor Interchanges (NNIs) to improve a reasonable starting tree topology. Since the original publication (Guindon and Gascuel, 2003), PhyML has been widely used (>2,300 citations in ISI Web of Science) because of its simplicity and its fair compromise between accuracy and speed. In the meantime, research around PhyML has continued, and this article describes the new algorithms and methods implemented in the program. First, we introduce a new algorithm to search the tree space with user-defined intensity, using Subtree Pruning and Regrafting (SPR) topological moves. The parsimony criterion is used here to filter out the least promising topology modifications with respect to the likelihood function. The analysis of a large collection of real nucleotide and amino-acid data sets of various sizes demonstrates the good performance of this method. Second, we describe a new test to assess the support of the data for internal branches of a phylogeny. This approach extends the recently proposed approximate likelihood-ratio test (aLRT) and relies on a non-parametric, Shimodaira-Hasegawa-like procedure. A detailed analysis of real alignments sheds light on the links between this new approach and the more classical non-parametric bootstrap method. Overall, our tests show that the latest version (3.0) of PhyML is fast, accurate, stable and ready to use. A web server and binary files are available from http://www.atgc-montpellier.fr/phyml/
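
The parsimony filter mentioned in the abstract can be illustrated with a small sketch. It scores candidate topologies with Fitch parsimony and keeps only the cheapest ones for a later, more expensive likelihood evaluation. Everything here is an assumption made for illustration: the nested-tuple tree encoding, the function names, the `keep` parameter and the toy alignment are hypothetical and are not taken from PhyML, which applies its filter to individual SPR moves rather than to whole candidate trees.

```python
from typing import Dict, List, Set, Tuple, Union

# Hypothetical encoding: a leaf is a taxon name, an internal node is a pair.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch_site(tree: Tree, states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Fitch pass for one alignment column: (state set, number of changes)."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], states)
    right_set, right_cost = fitch_site(tree[1], states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony_score(tree: Tree, alignment: Dict[str, str]) -> int:
    """Sum the Fitch change counts over all alignment columns."""
    n_sites = len(next(iter(alignment.values())))
    total = 0
    for i in range(n_sites):
        column = {taxon: seq[i] for taxon, seq in alignment.items()}
        total += fitch_site(tree, column)[1]
    return total


def filter_by_parsimony(
    candidates: List[Tree], alignment: Dict[str, str], keep: int
) -> List[Tree]:
    """Keep the `keep` candidates with the lowest parsimony score."""
    ranked = sorted(candidates, key=lambda t: parsimony_score(t, alignment))
    return ranked[:keep]


# Toy data (assumed): four taxa, three sites.
alignment = {"A": "AAG", "B": "AAG", "C": "CTG", "D": "CTA"}
t1: Tree = (("A", "B"), ("C", "D"))
t2: Tree = (("A", "C"), ("B", "D"))
t3: Tree = (("A", "D"), ("B", "C"))

# Only the survivors would be passed on to likelihood evaluation.
survivors = filter_by_parsimony([t1, t2, t3], alignment, keep=1)
print(survivors)
```

The design idea is that parsimony scores are cheap to compute compared with likelihoods, so they can serve as a first-pass screen. How many candidates to keep is a tuning choice that this sketch leaves to the caller.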

Keywords

phylogenetic software; maximum likelihood; tree search algorithms; NNI and SPR; branch testing; LRT and aLRT; bootstrap analysis

Domains

Bioinformatics [q-bio.QM]; Bioinformatics, Systems Biology [q-bio.QM]

Main file

GuindonEtAlGascuel_SystBiol2010pdf.pdf (1.52 MB)

Origin: Publisher files allowed on an open archive

Contributor: Isabelle Gouat

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00511784

Submitted on: Wednesday, September 5, 2012, 10:49:40

Last modified on: Wednesday, July 19, 2023, 12:39:11

Long-term archiving on: Thursday, December 6, 2012, 03:50:09

Dates and versions

lirmm-00511784, version 1 (26-08-2010)

lirmm-00511784, version 2 (05-09-2012)

Identifiers

HAL Id: lirmm-00511784, version 2
DOI: 10.1093/sysbio/syq010

Cite

Stéphane Guindon, Jean-François Dufayard, Vincent Lefort, Maria Anisimova, Wim Hordijk, et al. New Algorithms and Methods to Estimate Maximum-Likelihood Phylogenies: Assessing the Performance of PhyML 3.0. Systematic Biology, 2010, 59 (3), pp.307-321. ⟨10.1093/sysbio/syq010⟩. ⟨lirmm-00511784v2⟩

Collections

CNRS MAB LIRMM MIPS UNIV-MONTPELLIER
