RedOak: a reference-free and alignment-free structure for indexing a collection of similar genomes - LIRMM - Laboratoire d’Informatique, de Robotique et de Microélectronique de Montpellier
Preprints, Working Papers, ... Year: 2021

RedOak: a reference-free and alignment-free structure for indexing a collection of similar genomes

Abstract

Background: As the cost of DNA sequencing decreases, high-throughput sequencing technologies are becoming accessible to more and more laboratories. This raises new problems that call for new algorithms, including tools for indexing and compressing hundreds to thousands of complete genomes.

Results: This paper presents RedOak, a reference-free and alignment-free software package for indexing a large collection of similar genomes. RedOak can also be applied to reads from unassembled genomes, and it provides a nucleotide sequence query function. The software is based on a k-mer approach and was designed to be heavily parallelized and distributed across several nodes of a cluster. The source code of the RedOak algorithm is available at https://gitlab.info-ufr.univ-montp2.fr/DoccY/RedOak.

Conclusions: RedOak may be useful to biologists and bioinformaticians who need to extract information from large sequence datasets.
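To make the k-mer idea concrete, here is a minimal Python sketch of indexing several genomes by k-mer presence and querying a nucleotide sequence against that index. It is not RedOak's data structure or API: the function names (`kmers`, `build_index`, `query`), the dictionary-of-sets layout, and the scoring by fraction of shared k-mers are all assumptions chosen for illustration. The sketch also ignores reverse complements, compression, and parallelism.

```python
from collections import defaultdict


def kmers(seq, k):
    """Yield each length-k window of seq, skipping windows with non-ACGT symbols."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield kmer


def build_index(genomes, k):
    """Map each k-mer to the set of genome names in which it occurs.

    genomes is a hypothetical {name: sequence} dictionary.
    """
    index = defaultdict(set)
    for name, seq in genomes.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def query(index, sequence, k):
    """Return, for each genome, the fraction of the query's k-mers it contains."""
    query_kmers = list(kmers(sequence, k))
    if not query_kmers:
        return {}
    counts = defaultdict(int)
    for kmer in query_kmers:
        # .get avoids creating empty entries in the defaultdict index
        for name in index.get(kmer, ()):
            counts[name] += 1
    return {name: n / len(query_kmers) for name, n in counts.items()}


if __name__ == "__main__":
    toy_genomes = {"g1": "ACGTACGT", "g2": "ACGTTTTT"}
    toy_index = build_index(toy_genomes, k=4)
    print(query(toy_index, "ACGTA", k=4))
```

Because no alignment is performed and no reference genome is needed, a query only requires dictionary lookups. Genomes that share few k-mers with the query simply score low or are absent from the result.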
Main file
2020.12.19.423583v1.full.pdf (1.12 MB)

Dates and versions

lirmm-03117453 , version 1 (21-01-2021)

Identifiers

Cite

Clément Agret, Annie Chateau, Gaëtan Droc, Gautier Sarah, Alban Mancheron, et al. RedOak: a reference-free and alignment-free structure for indexing a collection of similar genomes. 2021. ⟨lirmm-03117453⟩
