Novel definition and algorithm for chaining fragments with proportional overlaps

Raluca Uricaru 1, Alban Mancheron 1, Eric Rivals 2, 1, *
* Corresponding author
1 MAB - Méthodes et Algorithmes pour la Bioinformatique
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: Chaining fragments is a crucial step in genome alignment. Existing chaining algorithms compute a maximum weighted chain in which no overlap is allowed between adjacent fragments. In practice, using local alignments as fragments instead of MEMs (Maximal Exact Matches) generates frequent overlaps between fragments, for both combinatorial reasons and biological factors, i.e., variable tandem repeat structures that differ in copy number between genomic sequences. In this paper, to lift this limitation, we formulate a novel definition of a chain that allows overlaps proportional to the fragment lengths, and present an efficient algorithm for computing such a maximum weighted chain. We tested our algorithm on a dataset of 694 genome pairs and observed significant improvements in coverage, while keeping running times within reasonable limits. Moreover, experiments with different ratios of allowed overlap showed that the chains are robust with respect to these ratios.
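
To make the idea of a chain with proportional overlaps concrete, the following is a minimal sketch and not the algorithm from the paper. It uses a plain quadratic dynamic program rather than the efficient algorithm the abstract mentions. It rests on several assumptions that this record does not state: fragments are half-open intervals on two sequences, an overlap between adjacent fragments is allowed when it is at most `ratio` times the shorter of the two fragment lengths on each sequence, and the chain weight is the sum of fragment weights minus the larger of the two overlaps. The names `Fragment`, `can_precede`, and `best_chain` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    # Half-open intervals [start, end) on sequences x and y (an assumption).
    x_start: int
    x_end: int
    y_start: int
    y_end: int
    weight: float


def overlap(prev_end, next_start):
    """Length by which the previous interval runs into the next one."""
    return max(0, prev_end - next_start)


def can_precede(f, g, ratio):
    """True if f may come directly before g in a chain.

    Assumed rule: starts and ends are strictly ordered on both sequences,
    and each overlap is at most `ratio` times the shorter fragment length.
    """
    ordered = (f.x_start < g.x_start and f.x_end < g.x_end
               and f.y_start < g.y_start and f.y_end < g.y_end)
    if not ordered:
        return False
    x_limit = ratio * min(f.x_end - f.x_start, g.x_end - g.x_start)
    y_limit = ratio * min(f.y_end - f.y_start, g.y_end - g.y_start)
    return (overlap(f.x_end, g.x_start) <= x_limit
            and overlap(f.y_end, g.y_start) <= y_limit)


def best_chain(fragments, ratio=0.1):
    """Return (score, chain) for a maximum weighted chain, in O(n^2) time."""
    frags = sorted(fragments, key=lambda f: (f.x_start, f.y_start))
    if not frags:
        return 0, []
    score = [f.weight for f in frags]  # best chain score ending at each fragment
    back = [-1] * len(frags)           # predecessor index, -1 for chain start
    for i, g in enumerate(frags):
        for j in range(i):
            f = frags[j]
            if can_precede(f, g, ratio):
                # Assumed penalty: the larger of the two overlaps.
                penalty = max(overlap(f.x_end, g.x_start),
                              overlap(f.y_end, g.y_start))
                candidate = score[j] + g.weight - penalty
                if candidate > score[i]:
                    score[i] = candidate
                    back[i] = j
    end = max(range(len(frags)), key=score.__getitem__)
    best = score[end]
    chain = []
    while end != -1:
        chain.append(frags[end])
        end = back[end]
    chain.reverse()
    return best, chain
```

With `ratio=0`, the sketch reduces to classical chaining without overlaps; raising `ratio` lets slightly overlapping local alignments join the same chain, which is the effect on coverage that the abstract describes.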

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00834143
Contributor: Eric Rivals
Submitted on: Friday, June 14, 2013 - 11:55:49
Last modified on: Thursday, May 24, 2018 - 15:59:22
Document(s) archived on: Sunday, September 15, 2013 - 04:10:50

File

OverlapChaining-author-copy.pd...
Files produced by the author(s)

Citation

Raluca Uricaru, Alban Mancheron, Eric Rivals. Novel definition and algorithm for chaining fragments with proportional overlaps. Journal of Computational Biology, Mary Ann Liebert, 2011, 18 (9), pp.1141-1154. 〈http://online.liebertpub.com/doi/abs/10.1089/cmb.2011.0126〉. 〈10.1089/cmb.2011.0126〉. 〈lirmm-00834143〉
