Approximate Likelihood-Ratio Test for Branches: A Fast, Accurate, and Powerful Alternative
Abstract
We revisit statistical tests for branches of evolutionary trees reconstructed from molecular data. A new, fast, approximate likelihood-ratio test (aLRT) for branches is presented here as a competitive alternative to the non-parametric bootstrap and to Bayesian estimation of branch support. The aLRT is based on the idea of the conventional LRT, with the null hypothesis corresponding to the assumption that the inferred branch has length 0. We show that the LRT statistic is asymptotically distributed as the maximum of three random variables drawn from a mixture of chi-square distributions. The new aLRT of an interior branch uses this distribution for significance testing, but the test statistic is approximated in a slightly conservative but practical way as 2(l1 - l2), i.e., double the difference between the maximum log-likelihood values corresponding to the best tree (l1) and to the second-best topological arrangement around the branch of interest (l2). Such a test is fast because the log-likelihood value l2 is computed by optimizing only over the branch of interest and the four adjacent branches, while the other parameters are fixed at their optimal values corresponding to the best ML tree. The aLRT is implemented within the algorithm used by the recent fast maximum-likelihood tree-estimation program PHYML (Guindon and Gascuel, 2003). The performance of the new test was studied on simulated 4-, 12-, and 100-taxon datasets with sequences of different lengths. The aLRT is shown to be accurate, powerful, and robust to certain violations of model assumptions.