On the Approximability Results for Maximum Agreement Subtree and Maximum Compatible Tree Problems
Résumé
The aim of this paper is to give a complete picture of approximability for two tree consensus problems which are of particular interest in computational biology: Maximum Agreement SubTree (MAST) and Maximum Compatible Tree (MCT). Both problems take as input a label set and a collection of trees whose leaf sets are each bijectively labeled with the label set. Define the size of a tree as the number of its leaves. The well-known MAST problem consists of finding a maximum-sized tree that is topologically embedded in each input tree, under label-preserving embeddings. Its variant MCT is less stringent, as it allows the input trees to be arbitrarily refined. Our results are as follows. We show that MCT is -hard to approximate within bound n1− on rooted trees, where n denotes the size of each input tree; the same approximation lower bound was already known for MAST [J. Jansson, Consensus algorithms for trees and strings, Ph. D. Thesis, Lund University, 2003]. Furthermore, we prove that MCT on two rooted trees is not approximable within bound 2log1−n, unless all problems in are solvable in quasi-polynomial time; the same result was previously established for MAST on three rooted trees [J. Hein, T. Jiang, L. Wang, K. Zhang, On the complexity of comparing evolutionary trees, Discrete Applied Mathematics 71 (1-3) (1996) 153-169] (note that MAST on two trees is solvable in polynomial time [M.A. Steel, T.J. Warnow, Kaikoura tree theorems: Computing the maximum agreement subtree, Information Processing Letters 48 (2) (1993) 77-82]). Let CMAST, resp. CMCT, denote the complement version of MAST, resp. MCT: CMAST, resp. CMCT, consists of finding a tree that is a feasible solution of MAST, resp. MCT, and whose leaf label set excludes a smallest subset of the input labels. The approximation threshold for CMAST, resp. CMCT, on rooted trees is shown to be the same as the approximation threshold for CMAST, resp. CMCT, on unrooted trees; it was already known that both CMAST and CMCT are approximable within ratio three on rooted and unrooted trees [V. Berry, F. Nicolas, Maximum agreement and compatible supertrees, in: S.C. Sahinalp, S. Muthukrishnan, U. Dogrusoz (Eds.), Proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching, CPM'04, in: LNCS, vol. 3109, Springer-Verlag, 2004, pp. 205-219; G. Ganapathy, T.J. Warnow, Approximating the complement of the maximum compatible subset of leaves of k trees, in: K. Jansen, S. Leonardi, V. V. Vazirani (Eds.), Proceedings of the 5th International Workshop on Approximation Algorithms for Combinatorial Optimization, APPROX'02, in: LNCS, vol. 2462, Springer-Verlag, 2002, pp. 122-134]. The latter results are completed by showing that CMAST is -hard on three rooted trees and that CMCT is -hard on two rooted trees.