Relationship between superstring and compression measures: New insights on the greedy conjecture

Bastien Cazaux 1, 2, Eric Rivals 2, 1
1 MAB - Méthodes et Algorithmes pour la Bioinformatique
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: A superstring of a set of words is a string that contains each input word as a substring. Given such a set, the Shortest Superstring Problem (SSP) asks for a superstring of minimum length. SSP is an important theoretical problem related to the Asymmetric Travelling Salesman Problem, and it also has practical applications in data compression and in bioinformatics: it models the question of assembling a genome from a set of sequencing reads. Unfortunately, SSP is known to be NP-hard even on a binary alphabet, and it is also hard to approximate with respect to either the superstring length or the compression achieved by the superstring. Even the variant in which all words share the same length r, called r-SSP, is NP-hard whenever r > 2. Numerous involved approximation algorithms achieve approximation ratios above 2 for the superstring, but remain difficult to implement in practice. In contrast, the greedy conjecture asked in 1988 whether a simple greedy algorithm achieves a ratio of 2 for SSP. Here, we present a novel approach to bound the superstring approximation ratio with the compression ratio, which, when applied to the greedy algorithm, shows an approximation ratio of 2 for 3-SSP, and also that greedy achieves ratios smaller than 2. This leads to a new version of the greedy conjecture.
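To make the abstract's terms concrete, here is a minimal Python sketch of the textbook greedy heuristic: repeatedly merge the two strings with the largest overlap until one string remains. It is an illustration only, not the authors' implementation. The names `overlap`, `greedy_superstring` and `compression` are assumptions introduced here, and ties between equal overlaps are broken by sorted order, which is an arbitrary choice of this sketch.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of a, shorter than both words, that is a prefix of b."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(words):
    """Greedy heuristic: merge the pair with maximum overlap until one string is left."""
    unique = set(words)
    # Words contained in another word are covered for free, so drop them first.
    strings = sorted(w for w in unique
                     if not any(w != x and w in x for x in unique))
    while len(strings) > 1:
        best = (-1, 0, 1)  # (overlap length, index of left word, index of right word)
        for i, a in enumerate(strings):
            for j, b in enumerate(strings):
                if i != j:
                    k = overlap(a, b)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        merged = strings[i] + strings[j][k:]
        strings = [s for idx, s in enumerate(strings) if idx not in (i, j)]
        strings.append(merged)
    return strings[0] if strings else ""


def compression(words, superstring: str) -> int:
    """Compression measure: total length of the distinct input words minus the superstring length."""
    return sum(len(w) for w in set(words)) - len(superstring)


if __name__ == "__main__":
    reads = ["abc", "bcd", "cde"]
    s = greedy_superstring(reads)
    print(s, len(s), compression(reads, s))
```

On the three words `abc`, `bcd`, `cde`, the sketch returns `abcde`: a superstring of length 5 for inputs of total length 9, hence a compression of 4. The two measures named in the title are thus linked by the identity length = total input length - compression, although approximation ratios for one do not transfer directly to the other.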
Document type:
Journal article
Discrete Applied Mathematics, Elsevier, 2018, 245, pp.59-64. 〈10.1016/j.dam.2017.04.017〉

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01800340
Contributor: Eric Rivals
Submitted on: Friday, May 25, 2018 - 18:08:09
Last modified on: Wednesday, October 10, 2018 - 14:28:13
Document(s) archived on: Sunday, August 26, 2018 - 14:52:33

File

Cazaux-Rivals-DAM-2018.pdf
Publisher files allowed on an open archive

License

Distributed under a Creative Commons Attribution 4.0 International License

Citation

Bastien Cazaux, Eric Rivals. Relationship between superstring and compression measures: New insights on the greedy conjecture. Discrete Applied Mathematics, Elsevier, 2018, 245, pp.59-64. 〈10.1016/j.dam.2017.04.017〉. 〈lirmm-01800340〉
