Full Compressed Affix Tree Representations

Abstract : The Suffix Tree, a crucial and versatile data structure for string analysis of large texts, is often used in pattern matching and in bioinformatics applications. The Affix Tree generalizes the Suffix Tree in that it supports full tree functionalities in both search directions. The bottleneck of Affix Trees is the space required to store the data structure. Here, we discuss existing representations and classify them into two categories: Synchronous and Asynchronous. We design Compressed Affix Tree indexes in both categories and explore how to support all tree operations bidirectionally. This work compares alternative approaches for compressing the Affix Tree, measuring their space and time trade-offs for different operations. Moreover, to our knowledge, this is the first work that compares all Compressed Affix Tree implementations, i.e., four different approaches (the Affix Array, the Bidirectional Wavelet Index, and our two new structures), and measures their space and time trade-offs, offering a practical benchmark for this structure.
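
To make "both search directions" concrete, the following is a minimal sketch of bidirectional pattern extension. It is not the Affix Tree or any of the compressed indexes compared in the paper: it is a naive linear scan over the text, and the names `all_positions`, `extend_right`, and `extend_left` are hypothetical, chosen only for illustration. It shows the interface a bidirectional index offers: a pattern can grow by one character at either end while its set of occurrences is maintained.

```python
from typing import List, Tuple

# Half-open interval [start, end) of one occurrence of the pattern in the text.
Occ = Tuple[int, int]


def all_positions(text: str) -> List[Occ]:
    """Occurrences of the empty pattern: one at every position of the text."""
    return [(i, i) for i in range(len(text) + 1)]


def extend_right(text: str, occs: List[Occ], c: str) -> List[Occ]:
    """Given occurrences of P, return occurrences of P + c (forward search step)."""
    return [(s, e + 1) for s, e in occs if e < len(text) and text[e] == c]


def extend_left(text: str, occs: List[Occ], c: str) -> List[Occ]:
    """Given occurrences of P, return occurrences of c + P (backward search step)."""
    return [(s - 1, e) for s, e in occs if s > 0 and text[s - 1] == c]


# Illustrative usage (naive scan, NOT a compressed index): start from the
# middle of the pattern "abra" and alternate between the two directions.
text = "abracadabra"
occs = all_positions(text)
occs = extend_right(text, occs, "r")  # pattern is now "r"
occs = extend_left(text, occs, "b")   # pattern is now "br"
occs = extend_right(text, occs, "a")  # pattern is now "bra"
occs = extend_left(text, occs, "a")   # pattern is now "abra"
starts = [s for s, _ in occs]
```

Each step here costs time proportional to the number of current occurrences; an index such as those discussed in the abstract aims to support the same kind of two-sided extension, plus tree navigation, within compressed space.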

Cited literature [25 references]

https://hal-lirmm.ccsd.cnrs.fr/lirmm-02093302
Contributor : Eric Rivals
Submitted on : Monday, April 8, 2019 - 5:59:47 PM
Last modification on : Thursday, April 11, 2019 - 1:18:54 AM

File

Canovas-Rivals-dcc-2017.pdf
Files produced by the author(s)

Citation

Rodrigo Cánovas, Eric Rivals. Full Compressed Affix Tree Representations. DCC: Data Compression Conference, IEEE, Apr 2017, Snowbird, UT, United States. pp.102-111, ⟨10.1109/DCC.2017.39⟩. ⟨lirmm-02093302⟩
