LoRDEC: a tool for correcting errors in long sequencing reads

Eric Rivals 1, 2
1 MAB - Méthodes et Algorithmes pour la Bioinformatique
LIRMM - Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier
Abstract: High-throughput DNA/RNA sequencing is a routine experiment in molecular biology and in the life sciences in general. For instance, it is increasingly used in hospitals as a key procedure of personalized medicine. Compared to the second generation, third-generation sequencing technologies produce longer reads with comparatively lower throughput and a higher error rate. These errors include substitutions and indels, and they hinder or at least complicate downstream analyses such as mapping or de novo assembly. However, long-read data are often used in conjunction with short reads of the second generation. I will present a hybrid strategy, which we introduced last year, for correcting the long reads using the short reads. Unlike existing error correction tools, ours, called LoRDEC, avoids aligning short reads on long reads, which is computationally intensive. Instead, it takes advantage of a succinct graph to represent the short reads, and compares long reads to paths in the graph. Experiments show that LoRDEC outperforms existing methods in running time and memory while achieving a comparable correction performance. It can correct both Pacific Biosciences reads and MinION reads from Oxford Nanopore. LoRDEC is available at http://atgc.lirmm.fr/lordec; joint work with L. Salmela and A. Makrini.
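The graph-based idea in the abstract can be illustrated with a small sketch. It is not LoRDEC's implementation: a plain Python set of k-mers stands in for the succinct graph (assumed here to be a de Bruijn graph of the short reads), and the function names, the parameter `k`, and the threshold `min_count` are hypothetical. The sketch only locates the regions of a long read whose k-mers are absent from the graph; searching the graph for a replacement path is left out.

```python
from collections import Counter


def solid_kmers(short_reads, k, min_count=2):
    """Return the k-mers seen at least min_count times in the short reads.

    The resulting set acts as the node set of an implicit de Bruijn graph.
    min_count is an assumed threshold for filtering erroneous k-mers.
    """
    counts = Counter()
    for read in short_reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return {kmer for kmer, c in counts.items() if c >= min_count}


def successors(kmer, graph):
    """Out-neighbours of a k-mer: nodes overlapping it by k-1 bases."""
    return [kmer[1:] + b for b in "ACGT" if kmer[1:] + b in graph]


def weak_regions(long_read, graph, k):
    """Return (start, end) intervals of k-mer positions, end exclusive,
    where the long read's k-mers are absent from the graph."""
    regions = []
    start = None
    n = len(long_read) - k + 1
    for i in range(n):
        solid = long_read[i:i + k] in graph
        if not solid and start is None:
            start = i
        elif solid and start is not None:
            regions.append((start, i))
            start = None
    if start is not None:
        regions.append((start, n))
    return regions
```

With such a set, a weak region flanked by two solid k-mers marks a stretch where a corrector could look for a path between the flanking nodes using `successors`, with no read-to-read alignment involved.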
Document type:
Conference paper
Mosig A.; Rahnenführer J.; Rahmann S.; Eisenacher M. GCB: German Conference on Bioinformatics, Sep 2015, Dortmund, Germany. PeerJ, 2015, German Conference on Bioinformatics 2015 Collection. 〈http://gcb2015.cs.tu-dortmund.de〉

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01224319
Contributor: Eric Rivals
Submitted on: Wednesday, 4 November 2015 - 15:15:15
Last modified on: Thursday, 11 January 2018 - 06:26:13
Document(s) archived on: Friday, 5 February 2016 - 11:31:35

File

Rivals-GCB-keynote-talk-abs-20...
Files produced by the author(s)

Identifiers

  • HAL Id: lirmm-01224319, version 1

Citation

Eric Rivals. LoRDEC: a tool for correcting errors in long sequencing reads. Mosig A.; Rahnenführer J.; Rahmann S.; Eisenacher M. GCB: German Conference on Bioinformatics, Sep 2015, Dortmund, Germany. PeerJ, 2015, German Conference on Bioinformatics 2015 Collection. 〈http://gcb2015.cs.tu-dortmund.de〉. 〈lirmm-01224319〉
