Generalized Bootstrap Supports for Phylogenetic Analyses of Protein Sequences Incorporating Alignment Uncertainty

Abstract : Phylogenetic reconstructions are essential in genomics data analyses and depend on accurate multiple sequence alignment (MSA) models. We show that all currently available large-scale progressive multiple alignment methods are numerically unstable when dealing with amino-acid sequences: they produce significantly different output when the sequence input order is changed. Using the HOMFAM protein sequence dataset, we show that on datasets larger than 100 sequences this instability affects on average 21.5% of the aligned residues. The Maximum Likelihood (ML) trees estimated from these MSAs are equally unstable, with over 38% of the branches being sensitive to the sequence input order. We established that about two-thirds of this uncertainty stems from the unordered nature of child nodes within the guide trees used to estimate MSAs. To quantify this uncertainty, we developed unistrap, a novel approach that estimates the combined effect of alignment uncertainty and site sampling on phylogenetic tree branch supports. Compared with the regular bootstrap procedure, unistrap provides branch support estimates that take into account a larger fraction of the parameters impacting tree instability when processing datasets containing a large number of sequences.
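
The abstract describes combining two sources of variation, alignment uncertainty driven by sequence input order and site sampling, into one set of branch supports. The Python sketch below illustrates that general idea only; it is not the unistrap implementation, and the record does not describe the actual procedure. It assumes each replicate shuffles the input order, realigns, and then resamples alignment columns with replacement. All function names are hypothetical, `toy_align` is a stand-in that merely pads sequences with gaps (a real progressive aligner would go in its place), and tree inference and support counting are omitted.

```python
import random


def shuffle_input_order(records, rng):
    """Return the (name, sequence) records in a random input order."""
    shuffled = list(records)
    rng.shuffle(shuffled)
    return shuffled


def resample_columns(alignment, rng):
    """Classic bootstrap step: draw alignment columns with replacement."""
    n_cols = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(row[i] for i in picks)
            for name, row in alignment.items()}


def toy_align(records):
    """Stand-in aligner (assumption): right-pad sequences with gaps.

    A real aligner may return different alignments for different input
    orders; this placeholder does not, and exists only to make the
    sketch runnable.
    """
    width = max(len(seq) for _, seq in records)
    return {name: seq.ljust(width, "-") for name, seq in records}


def combined_replicates(records, align, n_replicates, seed=0):
    """Build replicates that vary both input order and sampled sites."""
    rng = random.Random(seed)
    replicates = []
    for _ in range(n_replicates):
        shuffled = shuffle_input_order(records, rng)   # alignment uncertainty
        alignment = align(shuffled)
        replicates.append(resample_columns(alignment, rng))  # site sampling
    return replicates
```

Under these assumptions, each replicate alignment would then be passed to a tree-inference program, and a branch's support would be the fraction of replicate trees that contain it.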
Keywords : Bootstrap analysis
Document type : Journal articles

https://hal-lirmm.ccsd.cnrs.fr/lirmm-02078444
Contributor : Laurent Brehelin
Submitted on : Monday, March 25, 2019 - 12:45:29 PM
Last modification on : Tuesday, March 26, 2019 - 1:18:12 AM

Citation

Maria Chatzou, Evan Floden, Paolo Di Tommaso, Olivier Gascuel, Cedric Notredame, et al. Generalized Bootstrap Supports for Phylogenetic Analyses of Protein Sequences Incorporating Alignment Uncertainty. Systematic Biology, Oxford University Press (OUP), 2018, 67 (6), pp.997-1009. ⟨10.1093/sysbio/syx096⟩. ⟨lirmm-02078444⟩
