Reverse engineering of compact suffix trees and links: A novel algorithm

Bastien Cazaux 1, 2, Eric Rivals 2, 1
1 MAB - Méthodes et Algorithmes pour la Bioinformatique
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: Invented in the 1970s, the Suffix Tree (ST) is a data structure that indexes all substrings of a text in linear space. Although more space-demanding than other indexes, the ST remains an inspiring index, likely because it represents substrings in a hierarchical tree structure. Over time, STs have acquired a central position in text algorithmics, with a myriad of algorithms and applications, for instance to motif discovery, biological sequence comparison, or text compression. It is well known that different words can lead to the same suffix tree structure with different labels. Moreover, because of the properties of STs, not every tree structure is an ST. Even the suffix links, which play a key role in efficient construction algorithms and many applications, are not sufficient to discriminate the suffix trees of distinct words. The question of recognising which trees can be STs has been raised and termed Reverse Engineering on STs. For the case where a tree is given with potential suffix links, a seminal work provides a linear-time solution only for binary alphabets. Here, we also investigate the Reverse Engineering problem on STs with links and exhibit a novel approach and algorithm. We hope that this new suffix tree characterisation is a valuable step towards a better understanding of suffix tree combinatorics.
This work is supported by ANR Colib'read (ANR-12-BS02-0008) and Défi MASTODONS SePhHaDe from CNRS.
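The abstract's remark that different words can lead to the same suffix tree structure can be checked on a small case. The following is a minimal sketch, not the algorithm of the paper: it assumes that each word is terminated by a unique symbol `$`, that trees are compared as unordered trees, and that edge labels and suffix links are ignored. The helper names (`suffix_trie`, `compact_shape`, `st_shape`) are hypothetical and chosen for illustration only.

```python
def suffix_trie(word):
    """Build the (uncompacted) suffix trie of `word` as nested dicts."""
    root = {}
    for start in range(len(word)):
        node = root
        for ch in word[start:]:
            node = node.setdefault(ch, {})
    return root


def compact_shape(node):
    """Return a canonical, label-free shape of the compact tree below `node`.

    Chains of single-child nodes are skipped, which mirrors the compaction
    of the trie into a suffix tree. Children are sorted so that two trees
    that differ only in child order get the same shape.
    """
    while len(node) == 1:
        node = next(iter(node.values()))
    return tuple(sorted(compact_shape(child) for child in node.values()))


def st_shape(word, terminator="$"):
    """Shape of the compact suffix tree of `word + terminator` (assumed convention)."""
    assert terminator not in word
    return compact_shape(suffix_trie(word + terminator))


# "aab" and "abb" are distinct words (not equal up to renaming of letters),
# yet their compact suffix trees have the same unlabeled structure.
print(st_shape("aab") == st_shape("abb"))
# "abc" gives a different structure: a root with four leaves.
print(st_shape("aab") == st_shape("abc"))
```

The quadratic-size trie is used here only for brevity on tiny inputs; a linear-space construction would be needed for real texts.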
Document type:
Journal article
Journal of Discrete Algorithms, Elsevier, 2014, StringMasters 2012 & 2013 Special Issue (Volume 1), 28, pp.9-22. 〈10.1016/j.jda.2014.07.002〉

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01082098
Contributor: Eric Rivals
Submitted on: Wednesday, 12 November 2014 - 15:44:42
Last modified on: Thursday, 11 January 2018 - 06:26:13
Document(s) archived on: Friday, 13 February 2015 - 11:15:21

File

Cazaux-Rivals-JDA-encrypt.pdf
Files produced by the author(s)

License

Distributed under a Creative Commons Attribution - NonCommercial - NoDerivatives 4.0 International License


Citation

Bastien Cazaux, Eric Rivals. Reverse engineering of compact suffix trees and links: A novel algorithm. Journal of Discrete Algorithms, Elsevier, 2014, StringMasters 2012 & 2013 Special Issue (Volume 1), 28, pp.9-22. 〈10.1016/j.jda.2014.07.002〉. 〈lirmm-01082098〉
