A new tool for non-hybrid correction of long noisy reads
Abstract
Several long-read sequencing technologies are now available. In theory, long reads are advantageous not only for assembling a genome, but also for investigating the linkage of genetic variations or the diversity of transcriptomes. However, current levels of sequencing errors hamper the use of long reads. For instance, even the simple task of aligning reads to a reference sequence becomes less reliable, more complex, and more time-consuming than with short reads. Several hybrid error correction methods, such as LoRDEC, were recently proposed: they require and take advantage of short reads to correct long reads. Here, we present a non-hybrid error correction method, which uses only long reads. This method, implemented in a tool called LoRMA, relies on LoRDEC to iteratively correct long reads using several de Bruijn graphs of increasing order. It then performs multiple alignment to compute a long-read consensus. LoRMA was tested on bacterial and yeast datasets and provides reliable correction in reasonable computing times.
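To illustrate the idea of de Bruijn graphs of increasing order, here is a minimal Python sketch. It is not LoRMA's or LoRDEC's implementation: the function names (`count_kmers`, `build_de_bruijn`, `weak_positions`), the `min_count` threshold, and the toy reads are all assumptions made for illustration. The sketch only builds a graph from frequent ("solid") k-mers for each k and flags read positions whose k-mer is absent from the graph. It performs no actual correction and no multiple alignment.

```python
from collections import Counter, defaultdict


def count_kmers(reads, k):
    """Count every k-mer occurring in the given reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


def build_de_bruijn(reads, k, min_count=2):
    """Build a de Bruijn graph of order k from solid k-mers only.

    Nodes are (k-1)-mers; each solid k-mer contributes one edge from
    its prefix to its suffix. `min_count` is an illustrative threshold.
    """
    graph = defaultdict(set)
    for kmer, n in count_kmers(reads, k).items():
        if n >= min_count:
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def weak_positions(read, graph, k):
    """Return start positions of k-mers in `read` that have no edge in the graph."""
    return [
        i for i in range(len(read) - k + 1)
        if read[i + 1:i + k] not in graph.get(read[i:i + k - 1], ())
    ]


# Toy data: the third read differs from the others by one substitution.
reads = ["ACGTACGTGA", "ACGTACGTGA", "ACGTTCGTGA"]

# Iterate over increasing orders, as the abstract describes at a high level.
for k in (3, 4, 5):
    graph = build_de_bruijn(reads, k)
    print(k, weak_positions(reads[2], graph, k))
```

In this toy setting, the positions flagged in the third read cluster around the substituted base, and the flagged region widens as k grows because more k-mers overlap the error. A real corrector would then search the graph for alternative paths across such regions.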
Domains
Bioinformatics [q-bio.QM]
Origin: Files produced by the author(s)