Automatic Identification of Large Collections of Protein-Coding or rRNA Sequences

The number of available genomic sequences is growing rapidly owing to the development of massive sequencing techniques. Identifying these sequences is necessary and contributes to assessing the evolutionary relationships among genes and species. Automated bioinformatics tools are therefore needed to perform this identification accurately and quickly. We developed HoSeqI (Homologous Sequence Identification), a software environment that automates this kind of sequence identification using databases of homologous gene families. HoSeqI is accessible through a Web interface (http://pbil.univ-lyon1.fr/software/HoSeqI/) that lets users identify one or several sequences and visualize the resulting alignments and phylogenetic trees. We also implemented a second application, MultiHoSeqI, which quickly adds a large set of sequences to a family database in order to identify them, update the database, or assist automatic genome annotation. More recently, we developed ChiSeqI (Chimeric Sequence Identification), an application that automates the identification of bacterial 16S ribosomal RNA sequences and the detection of chimeric sequences.

Keywords

Similarity; Automatic identification; Alignment; Phylogeny; Chimera

Domains

Bioinformatics [q-bio.QM]; Bioinformatics, Systems Biology [q-bio.QM]

Main file

Arigon_et_al-Biochimie-2008.pdf (406.23 KB)

Origin: Files produced by the author(s)

Contributor: Anne-Muriel Arigon

https://hal-lirmm.ccsd.cnrs.fr/lirmm-00366131

Submitted on: Thursday, 23 May 2013, 15:27:47

Last modified on: Saturday, 18 May 2024, 03:15:03

Long-term archiving on: Saturday, 24 August 2013, 02:25:08

Dates and versions

lirmm-00366131, version 1 (23-05-2013)

Identifiers

HAL Id: lirmm-00366131, version 1
DOI: 10.1016/j.biochi.2007.08.006

Cite

Anne-Muriel Arigon Chifolleau, Guy Perrière, Manolo Gouy. Automatic Identification of Large Collections of Protein-Coding or rRNA Sequences. Biochimie, 2008, 90 (4), pp.609-614. ⟨10.1016/j.biochi.2007.08.006⟩. ⟨lirmm-00366131⟩

Collections

CNRS UNIV-LYON1 MAB LIRMM BIOENVIS MIPS UNIV-MONTPELLIER LBBE UDL
