Wade Walke <dwwalke at scripps.edu> writes:
>I'm looking for a program that can accept input of multiple small (10-20
>bp long) DNA sequences (possibly hundreds of different sequences) then
>compare and analyze them for various consensus motifs. These sequences
>will not all fall into a single consensus motif so I cannot use a simple
>alignment program. I want to know if any such program exists and if not
>is there anyone willing to write such a program given the proper
>incentive?
>Wade Walke
>dwwalke at scripps.edu
I have a set of programs for doing what you wish. The programs were
written for UNIX and are located at
ftp://beagle.colorado.edu/pub/Consensus
The file "consensus.readme" describes what I have in the directory.
Currently, the anonymous ftp directory contains the following major
programs:
1) consensus
2) wconsensus
3) patser
4) gmat-inf-gc
The "consensus" program is the current version of the program
described in Stormo and Hartzell (1989, PNAS, 86:1183-1187) and Hertz
et al. (1990, CABIOS, 6:81-92). However, this program has many more
options than the published version. The most significant change is that
each sequence may contribute zero or more words to the pattern being
generated, rather than being required to contribute exactly one word.
Also, the algorithm is no longer dependent on the order in which the
sequences are presented to the program.
"wconsensus" differs from the "consensus" program in that the user
does not supply the width of the pattern being sought, although the
user must adjust a bias that determines the final width. This program
can allow terminal deletions in case some of your sequences are
missing the ends of your motif.
The "patser" program allows one to score the words of a sequence against
an alignment matrix obtained from the "consensus" or "wconsensus" program.
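To make the idea of scoring words against an alignment matrix concrete,
here is a minimal Python sketch. It is not the patser source and does not
reproduce its options or file formats; the function names, the pseudocount,
and the uniform background frequencies are all assumptions made for
illustration. It builds a log-odds matrix from a set of aligned words and
then scores every word of that width in a longer sequence.

```python
import math

ALPHABET = "ACGT"

def log_odds_matrix(aligned_words, background=None, pseudocount=1.0):
    """Build a log-odds matrix from equal-length aligned words.

    Assumption: a uniform background and a small pseudocount, so that
    bases never seen at a position still get a finite score.
    """
    background = background or {b: 0.25 for b in ALPHABET}
    width = len(aligned_words[0])
    matrix = []
    for pos in range(width):
        counts = {b: pseudocount * background[b] for b in ALPHABET}
        for word in aligned_words:
            counts[word[pos]] += 1
        total = sum(counts.values())
        matrix.append(
            {b: math.log(counts[b] / total / background[b]) for b in ALPHABET}
        )
    return matrix

def score_words(sequence, matrix):
    """Return (start, score) for every word of the matrix width."""
    width = len(matrix)
    return [
        (start, sum(matrix[i][sequence[start + i]] for i in range(width)))
        for start in range(len(sequence) - width + 1)
    ]

def best_word(sequence, matrix):
    """Return (start, word, score) for the highest-scoring word."""
    start, score = max(score_words(sequence, matrix), key=lambda hit: hit[1])
    return start, sequence[start:start + len(matrix)], score

if __name__ == "__main__":
    words = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]  # made-up example data
    matrix = log_odds_matrix(words)
    print(best_word("GGCTATAATGC", matrix))
```

Positions are 0-based in this sketch, and only the strand given is scanned;
scoring the complementary strand as well would need an extra pass.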
The "gmat-inf-gc" program produces a crude graph of the information
content at each position of an alignment obtained with the
"consensus" or "wconsensus" program.
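As a rough sketch of what information content per position means, the
Python below computes it in bits from a set of aligned words and prints a
crude text graph. It is not the gmat-inf-gc source; the function names, the
uniform background frequencies, and the bar scaling are assumptions chosen
for illustration, and no small-sample correction is applied.

```python
import math

ALPHABET = "ACGT"

def information_content(aligned_words, background=None):
    """Information content (bits) at each position of an alignment.

    Computed as sum over bases of f * log2(f / p), where f is the observed
    frequency at the position and p is the assumed background frequency.
    """
    background = background or {b: 0.25 for b in ALPHABET}
    n = len(aligned_words)
    width = len(aligned_words[0])
    info = []
    for pos in range(width):
        bits = 0.0
        for base in ALPHABET:
            freq = sum(1 for word in aligned_words if word[pos] == base) / n
            if freq > 0:  # a base that never occurs contributes nothing
                bits += freq * math.log2(freq / background[base])
        info.append(bits)
    return info

def text_graph(info, scale=10):
    """Render one bar of '#' characters per position (1-based labels)."""
    return "\n".join(
        f"{pos:3d} {bits:5.2f} " + "#" * round(bits * scale)
        for pos, bits in enumerate(info, start=1)
    )

if __name__ == "__main__":
    words = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]  # made-up example data
    print(text_graph(information_content(words)))
```

With a uniform background, a position where every word has the same base
reaches the maximum of 2 bits, and a position with all four bases equally
represented gives 0 bits.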
I also have alignment programs that can introduce gaps (insertions and
deletions) into the alignment. I do not currently have these programs
in my anonymous ftp directory, but will supply them upon request.
Jerry Hertz
Gerald Z. Hertz | telephone: (303) 492-1474
Dept. of MCD Biology | fax: (303) 492-7744
University of Colorado | internet: hertz at boulder.colorado.edu
Campus Box 347 |
Boulder, CO 80309-0347 |