Skip to Main content Skip to Navigation
Conference papers

Building the Assembly De Bruijn Graph from an Implicit Suffix Tree

Eric Rivals 1, 2, * 
* Corresponding author
2 MAB - Méthodes et Algorithmes pour la Bioinformatique
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract : Suffix trees belong to the most studied indexing data structures for strings. Generalised suffix trees can index the substrings of a set of input words. In its construction algorithm Ukkonen used implicit suffix trees, which relax the assumption that a suffix is not a prefix of another suffix, and thus represent some suffixes by internal nodes rather than by leaves. In computational biology, for the sake of genome assembly, one wishes to assemble a target sequence from a multitude of input strings, usually called the reads. Numerous algorithms build assembly related graphs that represent the overlaps of the input reads: for instance, the overlap graph, or a special version of the De Bruijn graph, in which only k-mers occurring within the reads are represented by nodes. We address the question of algorithms for building the overlap graph and the De Bruijn graph directly from indexing data structures of the read set, and investigate it when the index in an implicit generalised suffix tree. We will present the algorithms for the De Bruijn graph and for its contracted version, and show they run in a time that is linear in the size of the index. Generally, this work establishes algorithmic paths to convert classical indexing data structures in graph representations useful for assembly. Work in collaboration with Bastien Cazaux et Thierry Lecroq.
Complete list of metadata
Contributor : Eric Rivals Connect in order to contact the contributor
Submitted on : Wednesday, January 29, 2014 - 5:29:32 PM
Last modification on : Friday, August 5, 2022 - 3:02:17 PM
Long-term archiving on: : Sunday, April 9, 2017 - 2:43:01 AM


Files produced by the author(s)


  • HAL Id : lirmm-00938959, version 1



Eric Rivals. Building the Assembly De Bruijn Graph from an Implicit Suffix Tree. Mini-Workshop on the Storage, Search and Annotation of Multiple Similar Genomes, Dec 2013, Bielefeld, Germany. ⟨lirmm-00938959⟩



Record views


Files downloads