De novo assembly of viral quasispecies using overlap graphs

Abstract: A viral quasispecies, the ensemble of viral strains populating an infected person, can be highly diverse. For optimal assessment of virulence, pathogenesis, and therapy selection, determining the haplotypes of the individual strains can play a key role. As many viruses are subject to high mutation and recombination rates, high-quality reference genomes are often not available at the time of a new disease outbreak. We present SAVAGE, a computational tool for reconstructing individual haplotypes of intra-host virus strains without the need for a high-quality reference genome. SAVAGE makes use of either FM-index–based data structures or ad hoc consensus reference sequence for constructing overlap graphs from patient sample data. In this overlap graph, nodes represent reads and/or contigs, while edges reflect that two reads/contigs, based on sound statistical considerations, represent identical haplotypic sequence. Following an iterative scheme, a new overlap assembly algorithm that is based on the enumeration of statistically well-calibrated groups of reads/contigs then efficiently reconstructs the individual haplotypes from this overlap graph. In benchmark experiments on simulated and on real deep-coverage data, SAVAGE drastically outperforms generic de novo assemblers as well as the only specialized de novo viral quasispecies assembler available so far. When run on ad hoc consensus reference sequence, SAVAGE performs very favorably in comparison with state-of-the-art reference genome-guided tools. We also apply SAVAGE on two deep-coverage samples of patients infected by the Zika and the hepatitis C virus, respectively, which sheds light on the genetic structures of the respective viral quasispecies.
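To make the overlap-graph idea in the abstract concrete, here is a minimal toy sketch in Python. It is not SAVAGE's implementation. It assumes error-free reads and exact suffix-prefix matches found by brute force, whereas the abstract describes statistically calibrated edges and FM-index–based or reference-based overlap detection. The function names, the `min_len` parameter, and the example reads are all hypothetical.

```python
from itertools import permutations


def suffix_prefix_overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 if no exact overlap of at least `min_len` bases exists.
    (Assumption: exact matching only; no sequencing errors are modeled.)
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def build_overlap_graph(reads, min_len=3):
    """Build a directed overlap graph as an adjacency list.

    Nodes are read indices; an edge (i -> j, k) means the last k bases
    of read i equal the first k bases of read j.
    """
    graph = {i: [] for i in range(len(reads))}
    for i, j in permutations(range(len(reads)), 2):
        k = suffix_prefix_overlap(reads[i], reads[j], min_len)
        if k:
            graph[i].append((j, k))
    return graph


# Hypothetical toy reads, chosen so that they chain into one sequence.
reads = ["ACGTAC", "GTACGG", "ACGGTT"]
graph = build_overlap_graph(reads)
print(graph)
```

The all-pairs comparison is quadratic in the number of reads, which is acceptable only for a toy example; on deep-coverage data an indexed overlap search of the kind the abstract mentions would be needed.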
Document type:
Journal article
Genome Research, Cold Spring Harbor Laboratory Press, 2017, 27 (5), pp.835-848. 〈10.1101/gr.215038.116〉


https://hal-lirmm.ccsd.cnrs.fr/lirmm-01693168
Contributor: Isabelle Gouat
Submitted on: Thursday, January 25, 2018 - 19:52:56
Last modified on: Thursday, May 24, 2018 - 15:59:22
Document(s) archived on: Friday, May 25, 2018 - 02:01:50

File

835.pdf
Publisher files allowed on an open archive

License

Distributed under a Creative Commons Attribution - NonCommercial 4.0 International License


Citation

Jasmijn Baaijens, Amal Zine El Aabidine, Eric Rivals, Alexander Schönhuth. De novo assembly of viral quasispecies using overlap graphs. Genome Research, Cold Spring Harbor Laboratory Press, 2017, 27 (5), pp.835-848. 〈10.1101/gr.215038.116〉. 〈lirmm-01693168〉
