CRAC: A multi-purpose program to analyse large read collections
Abstract
Processing the huge collections of sequencing reads produced by High Throughput Sequencing (HTS) technologies currently requires complex analysis pipelines and hours or even days of computing. Numerous articles present pipelines that predict, for genomic or transcriptomic reads, a single type of biological event, such as small mutations, insertions-deletions, rearrangements, or normal or chimeric splice junctions (in transcriptomic data). We will present CRAC, a new read analysis program that fulfills multiple purposes: in a single analysis step, it can simultaneously predict the different kinds of biological events mentioned above, including junctions of chimeric RNAs. CRAC thus simplifies genomic or transcriptomic read analysis from the user's point of view. It includes sequencing error detection so that sequencing errors are not confused with true mutations. Moreover, integrating all predictions in a single step improves their sensitivity and specificity. We will show that, compared to other current solutions, CRAC delivers multiple predictions in highly competitive computing times. CRAC constitutes the basis of an HTS analysis service available on the ATGC bioinformatics platform: http://www.atgc-montpellier.fr/ngs/.