Dan S. Prestridge danp at biosci.cbs.umn.edu
Thu Oct 15 19:35:58 EST 1998

PROMOTER SCAN 2.04 is now available.

PROMOTER SCAN is a program designed to identify and characterize
mammalian pol II promoter sequences.  The current version recognizes
about 50% of pol II promoters that are neither included in, nor highly
homologous to those in, its training set, at a rate of about 1 false
positive promoter per 15,000 bases (single-stranded).  By default, the
program searches both strands.  It can recognize both TATA and
TATA-less promoters, but performs better on TATA promoters.
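
As a rough illustration of what the quoted false positive rate means in
practice, the arithmetic can be sketched as below.  The function name
and parameters are hypothetical and are not part of PROMOTER SCAN; the
sketch simply assumes that false positives scale linearly with the
number of bases searched and that a both-strand search doubles that
number.

```python
def expected_false_positives(length_bases, both_strands=True,
                             bases_per_false_positive=15_000):
    """Estimate false positive predictions for one input sequence.

    Assumes about 1 false positive per 15,000 single-stranded bases
    (the figure quoted in the announcement) and linear scaling.
    """
    # Searching both strands doubles the number of bases examined.
    strands = 2 if both_strands else 1
    return strands * length_bases / bases_per_false_positive


# A 10,000-base sequence searched on both strands (the default):
print(round(expected_false_positives(10_000), 2))
```

Under these assumptions, a 10,000-base sequence searched on both
strands would be expected to yield a little more than one false
positive prediction on average.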

In cross-validation tests, and in a test of new promoters in the latest
version of the EPD, PROMOTER SCAN recognized 50% of promoters never
before seen by the program.  It does poorly, however, on the
Fickett/Hatzigeorgiou promoter test set.  There may be a few reasons
for this, including: 1) the test set is too small and narrow, 2) some
sequences do not contain full promoter sequences (one contains only
about 25 bp of promoter sequence, not a full promoter by any
definition), 3) not all promoters are mammalian, and 4) the set
contains sequence information far 5' of core promoter sequences, which
was not considered in training and is a deficit of PROMOTER SCAN (to be
corrected in the future).  Real tests will have to wait for the
appearance of well-characterized, full-length, multi-gene sequences
(note that some of them, such as the globin locus, are poor test
candidates, since regulation is spread across the entire globin locus
rather than residing in each individual gene promoter).

PROMOTER SCAN also attempts some limited functional characterization of
predicted promoter sequences.  It does this by comparing the pattern of
transcription factor binding sites with similar patterns in promoters
contained in the Eukaryotic Promoter Database.  Success with this
approach has so far been limited (and largely unmeasurable, for several
reasons): it appears to characterize the functional properties of some
classes of promoters (such as some immune system promoters) while doing
very badly on other promoter classes (such as metal-responsive
promoters).  At this time this aspect of the program is highly
experimental and will hopefully improve in the future.  In this
version, this part of the program is more of a proof of concept than a
predictive tool.  But it's fun.

The program is available for limited free use at:


This online version is limited to 2 runs per day and to a limited
sequence length (10,000 bases).  A local copy of the program is
available, but only as a web version, to run on a Sun computer server
(no other computer platforms, or even other UNIX platforms, are
supported at this time).  You will need someone with expert knowledge
of web software installation to install it properly.  A version (Sun
UNIX) is also available to run on the command line using Sun binaries
and UNIX shell scripts; this version will run through a list of
promoter sequences.  To obtain a copy of the program, please contact
the author (Dan Prestridge) at danp at sequana.com.  The program is
free for non-profit use, and there is a fee for commercial use.
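
Because the free online version accepts at most 10,000 bases per
sequence, it may help to check sequence lengths before using one of the
two daily runs.  The sketch below is only an illustration: it assumes
the sequences are held in FASTA format (the announcement does not state
the input format), and the helper names read_fasta and too_long are
hypothetical, not part of PROMOTER SCAN.

```python
MAX_BASES = 10_000  # length limit quoted for the free online version


def read_fasta(text):
    """Parse FASTA-formatted text into a dict of name -> sequence.

    FASTA input is an assumption made for this sketch.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header line as the record name.
            fields = line[1:].split()
            name = fields[0] if fields else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def too_long(records, limit=MAX_BASES):
    """Return the names of sequences that exceed the length limit."""
    return [n for n, seq in records.items() if len(seq) > limit]


example = ">short demo\nACGTACGT\n>long\n" + "A" * 10_001 + "\n"
print(too_long(read_fasta(example)))
```

Any sequence reported by too_long would need to be trimmed to the
region of interest before submission, or run with a local copy instead.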


Dan S. Prestridge, Ph.D.                              E-mail:
 Senior Scientist
 Telephone: (619) 646-8367
 AxyS Pharmaceuticals                   Fax: (619) 452-6653
 11099 N. Torrey Pines Rd, Suite 160
 La Jolla, CA  92037
