Skip to Main content Skip to Navigation
Journal articles

Producing Genomic Sequences after Genome Scaffolding with Ambiguous Paths: Complexity, Approximation and Lower Bounds

Tom Davot 1 Annie Chateau 1 Rodolphe Giroudeau 2 Mathias Weller 3 Dorine Tabary 4
1 MAB - Méthodes et Algorithmes pour la Bioinformatique
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
2 MAORE - Méthodes Algorithmes pour l'Ordonnancement et les Réseaux
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract : Scaffolding is the final step in assembling Next Generation Sequencing data, in which pre-assembled contiguous regions ("contigs") are oriented and ordered using information that links them (for example, mapping of paired-end reads). As the genome of some species is highly repetitive, we allow placing some contigs multiple times, thereby generalizing established computational models for this problem. We study the subsequent problems induced by the translation of solutions of the model back to actual sequences, proposing models and analyzing the complexity of the resulting computational problems. We find both polynomialtime and N P-hard special cases like planarity or bounded degree. Finally, we propose two polynomial-time approximation algorithms according to cut/weight score.
Complete list of metadata

https://hal-lirmm.ccsd.cnrs.fr/lirmm-03218029
Contributor : Tom Davot <>
Submitted on : Wednesday, May 5, 2021 - 11:50:24 AM
Last modification on : Monday, July 19, 2021 - 11:04:36 AM

File

main_llncs.pdf
Files produced by the author(s)

Identifiers

Citation

Tom Davot, Annie Chateau, Rodolphe Giroudeau, Mathias Weller, Dorine Tabary. Producing Genomic Sequences after Genome Scaffolding with Ambiguous Paths: Complexity, Approximation and Lower Bounds. Algorithmica, Springer Verlag, 2021, ⟨10.1007/s00453-021-00819-6⟩. ⟨lirmm-03218029v1⟩

Share

Metrics

Record views

36

Files downloads

37