A linear time algorithm for Shortest Cyclic Cover of Strings

Bastien Cazaux 1, 2 Eric Rivals 2, 1
1 MAB - Méthodes et Algorithmes pour la Bioinformatique
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract : Merging words according to their overlap yields a superstring. This basic operation allows to infer long strings from a collection of short pieces, as in genome assembly. To capture a maximum of overlaps, the goal is to infer the shortest superstring of a set of input words. The Shortest Cyclic Cover of Strings (SCCS) problem asks, instead of a single linear superstring, for a set of cyclic strings that contain the words as substrings and whose sum of lengths is minimal. SCCS is used as a crucial step in polynomial time approximation algorithms for the notably hard Shortest Superstring problem, but it is solved in cubic time. The cyclic strings are then cut and merged to build a linear superstring. SCCS can also be solved by a greedy algorithm. Here, we propose a linear time algorithm for solving SCCS based on a Eulerian graph that captures all greedy solutions in linear space. Because the graph is Eulerian, this algorithm can also find a greedy solution of SCCS with the least number of cyclic strings. This has implications for solving certain instances of the Shortest linear or cyclic Superstring problems.
Type de document :
Article dans une revue
Journal of Discrete Algorithms, Elsevier, 2016, 37, pp.56 - 67. 〈10.1016/j.jda.2016.05.001〉
Liste complète des métadonnées

Littérature citée [20 références]  Voir  Masquer  Télécharger

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01525995
Contributeur : Eric Rivals <>
Soumis le : lundi 22 mai 2017 - 15:08:22
Dernière modification le : jeudi 11 janvier 2018 - 06:26:13
Document(s) archivé(s) le : mercredi 23 août 2017 - 15:52:43

Fichier

Cazaux-Rivals-JDA-2016.pdf
Publication financée par une institution

Identifiants

Collections

Citation

Bastien Cazaux, Eric Rivals. A linear time algorithm for Shortest Cyclic Cover of Strings. Journal of Discrete Algorithms, Elsevier, 2016, 37, pp.56 - 67. 〈10.1016/j.jda.2016.05.001〉. 〈lirmm-01525995〉

Partager

Métriques

Consultations de la notice

203

Téléchargements de fichiers

63