Estimation of sequence errors and capacity of genomic annotation in transcriptomic and DNA-protein interaction assays based on next generation sequencers

Abstract: Next generation sequencers make it possible to analyse the transcriptome or the interactome at unprecedented depth. These techniques yield short sequence reads that are then mapped on a genome sequence to predict putatively transcribed or protein-interacting regions. We argue that factors such as false locations, sequence errors, and read length affect the predictive capacity of mapping these short reads. Here we suggest a computational approach to measure those factors and analyse their influence on both transcriptomic and epigenomic assays. This investigation provides new clues on both methodological and biological issues. First, we estimate that 4.6% of reads are affected by SNPs. Second, we show that the nucleotide error probability is low but increases significantly with the position in the sequence. Third, by choosing a read length above 19 bp, we practically eliminate the risk of finding irrelevant positions. However, the number of uniquely mapped reads decreases for read lengths above 20 bp. Following our procedure, we obtain 0.6% of false positives among genomic locations. Therefore, even rare signatures, if they are mapped on the genome, should identify biologically relevant regions. This indicates that digital transcriptomics may help to characterise the wealth of yet undiscovered, low abundance transcripts.
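To give a rough intuition for why read length matters for false locations, the sketch below computes the expected number of purely accidental matches of a read in a genome. It is not the procedure used in the article: it assumes a simplistic model in which bases are independent and uniformly distributed, which ignores repeats and real base composition. The function name `expected_chance_hits` and the genome size of about 3.1 Gbp are illustrative assumptions, not values taken from the abstract.

```python
def expected_chance_hits(read_length: int, genome_size: int,
                         both_strands: bool = True) -> float:
    """Expected number of accidental exact matches of one random read.

    Assumption (not from the article): bases are independent and uniform,
    so a given position matches a read of length k with probability 4**-k.
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    if genome_size < read_length:
        raise ValueError("genome_size must be at least read_length")
    # Number of start positions where a read of this length fits.
    positions = genome_size - read_length + 1
    # A read can match either strand of the genome.
    strands = 2 if both_strands else 1
    return strands * positions / 4 ** read_length


if __name__ == "__main__":
    GENOME_SIZE = 3_100_000_000  # assumed, roughly human-sized genome
    for k in (14, 16, 18, 19, 20, 25):
        hits = expected_chance_hits(k, GENOME_SIZE)
        print(f"read length {k:2d} bp: {hits:.6g} expected chance matches")
```

Under this toy model the expected number of chance matches falls by a factor of four for each added base, so it drops well below one match per read before 19 bp. Real genomes contain repeats, which is one reason an empirical estimate such as the one described in the abstract is needed.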
Document type:
Journal article
Cellular Oncology, IOS Press, 2009, 31 (2), pp.145-146

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00416012
Contributor: Nicolas Philippe
Submitted on: Friday, 11 September 2009 - 16:10:46
Last modified on: Thursday, 4 October 2018 - 10:58:05

Identifiers

  • HAL Id : lirmm-00416012, version 1

Citation

Nicolas Philippe, Anthony Boureux, Laurent Brehelin, Jorma Tarhio, Thérèse Commes, et al. Estimation of sequence errors and capacity of genomic annotation in transcriptomic and DNA-protein interaction assays based on next generation sequencers. Cellular Oncology, IOS Press, 2009, 31 (2), pp.145-146. 〈lirmm-00416012〉

Metrics

Record views

371