Journal articles

Rapid alignment-free phylogenetic identification of metagenomic sequences

Benjamin Linard 1, Krister Swenson 1, Fabio Pardi 1
1 MAB - Méthodes et Algorithmes pour la Bioinformatique
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract : Motivation: Taxonomic classification is at the core of environmental DNA analysis. When a phylogenetic tree can be built as a prior hypothesis to such classification, phylogenetic placement (PP) provides the most informative type of classification because each query sequence is assigned to its putative origin in the tree. This is useful whenever precision is sought (e.g. in diagnostics). However, likelihood-based PP algorithms struggle to scale with the ever-increasing throughput of DNA sequencing.
Results: We have developed RAPPAS (Rapid Alignment-free Phylogenetic Placement via Ancestral Sequences) which uses an alignment-free approach, removing the hurdle of query sequence alignment as a preliminary step to PP. Our approach relies on the precomputation of a database of k-mers that may be present with non-negligible probability in relatives of the reference sequences. The placement is performed by inspecting the stored phylogenetic origins of the k-mers in the query, and their probabilities. The database can be reused for the analysis of several different metagenomes. Experiments show that the first implementation of RAPPAS is already faster than competing likelihood-based PP algorithms, while keeping similar accuracy for short reads. RAPPAS scales PP for the era of routine metagenomic diagnostics.
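The placement idea described in the abstract (look up each query k-mer in a precomputed database and combine the stored probabilities per phylogenetic origin) can be illustrated with a minimal Python sketch. This is not the RAPPAS implementation: the function names `kmers` and `place`, the toy database, the branch labels `b1` and `b2`, the use of summed log10 probabilities as a score, and the fallback value `default_logp` for k-mers not stored for a branch are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def kmers(seq, k):
    """Yield all overlapping k-mers of a sequence."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def place(query, db, k, default_logp):
    """Rank branches by the summed log10 probability of the query's k-mers.

    db maps k-mer -> {branch_id: log10 probability} (hypothetical layout).
    A query k-mer with no stored entry for a branch contributes
    default_logp to that branch (an assumed fallback, not from the source).
    """
    words = list(kmers(query, k))
    scores = defaultdict(float)
    hits = defaultdict(int)
    for word in words:
        for branch, logp in db.get(word, {}).items():
            scores[branch] += logp
            hits[branch] += 1
    # Penalise each branch for the query k-mers it has no entry for.
    for branch in scores:
        scores[branch] += (len(words) - hits[branch]) * default_logp
    # Best-scoring branch first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy database with made-up probabilities, for illustration only.
toy_db = {
    "ACG": {"b1": math.log10(0.9), "b2": math.log10(0.2)},
    "CGT": {"b1": math.log10(0.8)},
    "GTA": {"b2": math.log10(0.5)},
}

ranked = place("ACGTA", toy_db, k=3, default_logp=math.log10(0.01))
best_branch = ranked[0][0]
```

Because the database depends only on the reference tree and sequences, a structure like `toy_db` would be built once and then queried for every read, which is consistent with the abstract's remark that the database can be reused across metagenomes.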

Cited literature [50 references]

https://hal-lirmm.ccsd.cnrs.fr/lirmm-02410441
Contributor : Krister Swenson
Submitted on : Friday, December 13, 2019 - 6:42:56 PM
Last modification on : Wednesday, May 27, 2020 - 4:02:04 AM

Licence


Distributed under a Creative Commons Attribution - NonCommercial 4.0 International License

Citation

Benjamin Linard, Krister Swenson, Fabio Pardi. Rapid alignment-free phylogenetic identification of metagenomic sequences. Bioinformatics, Oxford University Press (OUP), 2019, 35 (18), pp.3303-3312. ⟨10.1093/bioinformatics/btz068⟩. ⟨lirmm-02410441⟩
