SPACER: Identification of cis-regulatory elements with non-contiguous critical residues

Arijit Chakravarty, Jonathan M. Carlson, Radhika S. Khetani, Charles E. DeZiel and Robert H. Gross

Abstract

Motivation: Many transcription factors bind to sites that are long and loosely related to each other. De novo identification of such motifs is computationally challenging. In this paper, we propose a novel semi-greedy algorithm over the space of all IUPAC degenerate strings to identify the most over-represented highly degenerate motifs.
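To make "IUPAC degenerate strings" concrete, here is a minimal sketch of how a gapped, degenerate motif can be matched against a DNA sequence. It only illustrates the motif representation and is not SPACER's search algorithm. The function names `matches_at` and `find_occurrences`, and the example motif and sequence, are hypothetical.

```python
# Standard IUPAC nucleotide codes, each mapped to the bases it stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def matches_at(motif, sequence, start):
    """Return True if the degenerate motif matches sequence at offset start."""
    window = sequence[start:start + len(motif)]
    if len(window) < len(motif):
        return False
    # Every base in the window must be allowed by the code at that position.
    return all(base in IUPAC[code] for code, base in zip(motif, window))


def find_occurrences(motif, sequence):
    """Return all 0-based offsets where the motif matches the sequence."""
    motif = motif.upper()
    sequence = sequence.upper()
    last_start = len(sequence) - len(motif)
    return [i for i in range(last_start + 1) if matches_at(motif, sequence, i)]


if __name__ == "__main__":
    # "TGNNNCA" has two critical dinucleotides separated by a 3-base gap (N = any base).
    print(find_occurrences("TGNNNCA", "AATGACGCATTTGTTTCAGG"))
```

A run of `N` positions acts as a spacer between the critical residues, which is how one string can describe a motif whose informative positions are non-contiguous.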

Results: We present an implementation of this algorithm, named SPACER (Separated Pattern-based Algorithm for cis-Element Recognition) and demonstrate its effectiveness in identifying gapped and highly degenerate motifs. We compare SPACER’s performance against ten motif finders on 42 experimentally defined regulons from B. subtilis, E. coli and S. cerevisiae. These motif finders cover a wide range of both enumerative and statistical approaches, including programs specifically designed for prokaryotic and gapped motifs.

Supplemental Information

Supplemental information referred to in the paper can be found here.

Download SPACER

System requirements: Java VM 1.4 or later and 50 MB of free RAM at runtime.

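Download your preferred archive. Upon extraction, a single directory named "SPACER" will be created. From inside that directory, SPACER can be run immediately with the following command, where fasta_file is the path to your input sequences in FASTA format:

java -jar SPACER.jar fasta_file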

See the included README for more information.