From Indexing Data Structures to de Bruijn Graphs

Abstract : New technologies have tremendously increased sequencing throughput com-pared to traditional techniques, thereby complicating DNA assembly. Hence, as-sembly programs resort to de Bruijn graphs (dBG) of k-mers of short reads to compute a set of long contigs, each being a putative segment of the sequenced molecule. Other types of DNA sequence analysis, as well as preprocessing of the reads for assembly, use classical data structures to index all substrings of the reads. It is thus interesting to exhibit algorithms that directly build a dBG of order k from a pre-existing index, and especially a contracted version of the dBG, where non branching paths are condensed into single nodes. Here, we formalise the relation-ship between suffix trees/arrays and dBGs, and exhibit linear time algorithms for constructing the full or contracted dBGs. Finally, we provide hints explaining why this bridge between indexes and dBGs enables to dynamically update the order k of the graph.
Type de document :
Communication dans un congrès
Kulikov, Alexander S.; Kuznetsov, Sergei O.; Pevzner, Pavel. CPM: Combinatorial Pattern Matching, Jun 2014, Moscow, Russia. Springer International Publishing, CPM'2014: 25th Annual Symposium on Combinatorial Pattern Matching, LNCS (8486), pp.89-99, 2014, Combinatorial Pattern Matching. 〈http://www.springer.com/computer/image+processing/book/978-3-319-07565-5〉. 〈10.1007/978-3-319-07566-2_10〉
Liste complète des métadonnées

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01081429
Contributeur : Eric Rivals <>
Soumis le : vendredi 7 novembre 2014 - 17:47:40
Dernière modification le : vendredi 12 octobre 2018 - 22:12:01
Document(s) archivé(s) le : dimanche 8 février 2015 - 11:11:43

Fichier

cpm-dbg-auteurs.pdf
Fichiers produits par l'(les) auteur(s)

Licence


Distributed under a Creative Commons Paternité - Pas d'utilisation commerciale - Pas de modification 4.0 International License

Identifiants

Citation

Bastien Cazaux, Thierry Lecroq, Eric Rivals. From Indexing Data Structures to de Bruijn Graphs. Kulikov, Alexander S.; Kuznetsov, Sergei O.; Pevzner, Pavel. CPM: Combinatorial Pattern Matching, Jun 2014, Moscow, Russia. Springer International Publishing, CPM'2014: 25th Annual Symposium on Combinatorial Pattern Matching, LNCS (8486), pp.89-99, 2014, Combinatorial Pattern Matching. 〈http://www.springer.com/computer/image+processing/book/978-3-319-07565-5〉. 〈10.1007/978-3-319-07566-2_10〉. 〈lirmm-01081429〉

Partager

Métriques

Consultations de la notice

602

Téléchargements de fichiers

309