Fast and accurate genome-scale identification of DNA-binding sites
Abstract
Motivation: Discovering DNA binding sites in genome sequences is crucial for understanding genomic regulation. Currently available computational tools that search for binding sites using Position Weight Matrices of known motifs are often applied only to restricted genomic regions because of their long run times. The ever-increasing number of complete genome sequences calls for a new generation of algorithms capable of processing large amounts of data.

Results: Here we present MOTIF, a new algorithm that finds transcription factor binding sites in whole genome sequences in a few seconds. We provide a web service that lets users search with their own matrix or with multiple JASPAR matrices. Beyond its efficacy, the service properly handles undetermined positions within the genome sequence and reports, for each matching position, the matched word and its score.

Availability: MOTIF is freely available through a web interface at http://www.atgc-montpellier.fr/motif. The source code of the stand-alone search method of MOTIF is freely available at https://gite.lirmm.fr/rivals/motif.git. It is written in C++ and tested on Linux platforms.
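To make the kind of output described above concrete, here is a minimal Python sketch of a naive Position Weight Matrix scan that yields, for each matching position, the matched word and its score. It is not the MOTIF algorithm, whose method is not described in this abstract. The names `TOY_PWM` and `scan`, the toy scores, and the threshold are hypothetical, and skipping any window that contains an undetermined base (such as `N`) is an assumption made for illustration, not necessarily how MOTIF treats such positions.

```python
from typing import Dict, Iterator, List, Tuple

# Hypothetical toy matrix: one dict of per-base scores for each motif position.
# These values are invented for illustration and favour the word "ACG".
TOY_PWM: List[Dict[str, float]] = [
    {"A": 1.0, "C": -1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": 1.0, "G": -1.0, "T": -1.0},
    {"A": -1.0, "C": -1.0, "G": 1.0, "T": -1.0},
]


def scan(
    sequence: str, pwm: List[Dict[str, float]], threshold: float
) -> Iterator[Tuple[int, str, float]]:
    """Yield (0-based position, matched word, score) for windows scoring >= threshold."""
    width = len(pwm)
    seq = sequence.upper()
    for start in range(len(seq) - width + 1):
        word = seq[start:start + width]
        # Assumption: a window with any undetermined base (e.g. N) is skipped.
        if any(base not in "ACGT" for base in word):
            continue
        # Naive scoring: sum the matrix entry of each base at its position.
        score = sum(column[base] for column, base in zip(pwm, word))
        if score >= threshold:
            yield start, word, score


if __name__ == "__main__":
    for position, word, score in scan("TTACGNACGACT", TOY_PWM, 3.0):
        print(position, word, score)
```

This naive scan scores every window, so its cost grows with sequence length times matrix width; it only illustrates the input and output of a matrix search, not a genome-scale method.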
Origin | Files produced by the author(s) |
---|