An assessment of the amount of untapped fold level novelty in under-sampled areas of the tree of life

Abstract: Previous studies of protein fold space suggest that fold coverage is plateauing. However, sequence sampling has been (and remains to a large extent) heavily biased, focusing on culturable phyla. Sustained technological developments have fuelled the advent of metagenomics and single-cell sequencing, which might correct the current sequencing bias. The extent to which these efforts affect structural diversity remains unclear, although preliminary results suggest that uncultured organisms could constitute a source of new folds. We investigate to what extent genomes from uncultured and under-sampled phyla accessed through single-cell sequencing, metagenomics and high-throughput culturing efforts have the potential to increase protein fold space, and conclude that i) genomes from under-sampled phyla appear enriched in sequences not covered by current protein family and fold profile libraries, ii) this enrichment is linked to an excess of short (and possibly partly spurious) sequences in some of the datasets, iii) the discovery rate of novel folds among sequences uncovered by current fold and family profile libraries may be as high as 36%, but would ultimately translate into a marginal increase in global discovery of novel folds. Thus, genomes from under-sampled phyla should have a rather limited impact on increasing coarse-grained tertiary structure level novelty.
Document type:
Journal article
Scientific Reports, Nature Publishing Group, 2015, 5, pp.14717. 〈10.1038/srep14717〉

https://hal-lirmm.ccsd.cnrs.fr/lirmm-01348860
Contributor: Isabelle Gouat
Submitted on: Tuesday, 26 July 2016 - 07:27:57
Last modified on: Thursday, 24 May 2018 - 15:59:22

File

srep14717.pdf
Publisher files allowed on an open archive

Citation

Daniel Barry Roche, Thomas Brüls. An assessment of the amount of untapped fold level novelty in under-sampled areas of the tree of life. Scientific Reports, Nature Publishing Group, 2015, 5, pp.14717. 〈10.1038/srep14717〉. 〈lirmm-01348860〉

Metrics

Record views

308

File downloads

190