If you would like to isolate all short-period repeats
(microsatellites) in a sequence, perhaps the simplest tool to use is
xnun (eXclude Non-Unique Nucleotides). This tool was designed to
exclude microsatellite sequences when performing database searches, but
it can also be used to identify such sequences.
Reference: J.-M. Claverie and D.J. States (1993). Computers in
Chemistry 17: 191-201.
xnun is available from: ftp://ibc.wustl.edu/~agarwal/xnun.tar.gz
or http://www.ibc.wustl.edu/sensei/
The code for xnun is also incorporated into SENSEI, which will produce
summaries of the microsatellites. I recommend using SENSEI with the
following options for summarizing all the microsatellites of period 1
or 2.
sensei <sequence.fasta> self -hmode 1 -output Hx -scut 16 -xper 2
-xper determines the period, and -scut determines the minimum score in
bits for a microsatellite to be reported.
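
To make the idea of "period 1 or 2" repeats concrete, here is a minimal
Python sketch that scans a sequence for runs repeating with a short
period. It is not the xnun or SENSEI algorithm: the function name
find_microsatellites and its parameters are hypothetical, and the
min_length cutoff (in bases) is only a crude stand-in for the bit score
that -scut thresholds.

```python
def find_microsatellites(seq, max_period=2, min_length=8):
    """Return (start, end, unit) for each run with period <= max_period.

    Illustrative only: min_length is a plain length cutoff in bases,
    not the bit score used by SENSEI's -scut option.
    Coordinates are 0-based, end-exclusive.
    """
    seq = seq.upper()
    n = len(seq)
    hits = []
    i = 0
    while i < n:
        best = None
        for period in range(1, max_period + 1):
            # Extend while each base matches the base one period back.
            j = i + period
            while j < n and seq[j] == seq[j - period]:
                j += 1
            length = j - i
            # Keep the longest run; ties go to the smaller period.
            if length >= min_length and (best is None or length > best[1] - best[0]):
                best = (i, j, period)
        if best is not None:
            start, end, period = best
            hits.append((start, end, seq[start:start + period]))
            i = end  # skip past the reported run
        else:
            i += 1
    return hits


if __name__ == "__main__":
    example = "GGTCT" + "A" * 10 + "CGT" + "AC" * 6 + "TTG"
    for start, end, unit in find_microsatellites(example):
        print(start, end, unit)
```

A length cutoff is used here only because it is easy to reason about;
a score-based cutoff, as in SENSEI, also accounts for sequence
composition.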
We are in the process of setting up a web page for running these
searches. For more information on SENSEI, please see
http://www.ibc.wustl.edu/sensei/ or D.J. States and P. Agarwal
(1996). Compact Encoding Strategies for Molecular Sequence Similarity
Search. Fourth International Conference on Intelligent Systems for
Molecular Biology, 4:211-217, AAAI Press, Menlo Park, CA.
Pankaj
--
Pankaj Agarwal
agarwal at ibc.wustl.edu
http://www.ibc.wustl.edu/~agarwal
Center for Computational Biology, Institute for Biomedical Computing
Washington University, Box 8036, 700 S. Euclid Avenue, St. Louis, MO 63110