Conjugate Gibbs Sampling for Bayesian Phylogenetic Models
Résumé
We propose a new Markov Chain Monte Carlo (MCMC) sampling mechanism for Bayesian phylogenetic inference. This method, which we call conjugate Gibbs, relies on analytical conjugacy properties, and is based on an alternation between data augmentation and Gibbs sampling. The data augmentation step consists in sampling a detailed substitution history for each site, and across the whole tree, given the current value of the model parameters. Provided convenient priors are used, the parameters of the model can then be directly updated by a Gibbs sampling procedure, conditional on the current substitution history. Alternating between these two sampling steps yields a MCMC device whose equilibrium distribution is the posterior probability density of interest. We show, on real examples, that this conjugate Gibbs method leads to a significant improvement of the mixing behavior of the MCMC. In all cases, the decorrelation times of the resulting chains are smaller than those obtained by standard Metropolis Hastings procedures by at least one order of magnitude. The method is particularly well suited to heterogeneous models, i.e. assuming site-specific random variables. In particular, the conjugate Gibbs formalism allows one to propose efficient implementations of complex models, for instance assuming site-specific substitution processes, that would not be accessible to standard MCMC methods.