>>>>> "Vladislav" == Vladislav Grebenyuk <grebenyu at mail.Uni-Mainz.de> writes:
Vladislav> Many thanks to Michael Mitchell and Fred The program
Vladislav> REPRO (Heringa and Argos, 1993) is able to recognize
Vladislav> distant repeats in a single query sequence.
Vladislav> There is also some commercially available software
Vladislav> capable of searching for repeats in a SINGLE
Vladislav> sequence, for example GeneQuest from the DNA Star
Vladislav> package. But I have a lot of sequences, and I have to
Vladislav> find repeats, not mask them. To find repeats (like
Vladislav> SINE, LINE) I have to compare all my sequences to each
Vladislav> other. The old, famous PC-Gene is capable of database
Vladislav> creation and homology search. That could also be done in
Vladislav> FASTA. Then it is going to be N-1 runs, i.e. 499 runs
Vladislav> of homology search for my 500 sequences. And what shall
Vladislav> I do then? How can I handle this data? It is also just
Vladislav> a little hard to perform 499 runs of homology search
Vladislav> manually.
Again, assuming DNA repeats...
If you are looking for new (previously uncharacterised) repeats you
could use the miropeats scripts by Jeremy Parsons at the EBI. The package
contains a C-shell script which will display a Postscript diagram
of the repeats within or between different sequences. Another script
will print a report of the repeats as text. The package requires the
icatools programs. Both are available at:
http://corba.ebi.ac.uk/~jparsons/packages/pub/
You will need a C compiler and access to a Unix-style C-shell (and a
Postscript viewer like Ghostview). If you have access to a Linux box
you have all you need.
I've had good results with them. I would not try to look at all 500
sequences in the Postscript diagram at one time - it will not be
informative!
I don't think it will help you (having 500 sequences) but you could
take a look at Reputer:
http://bibiserv.techfak.uni-bielefeld.de/reputer/
It might still be useful if your individual sequences are very long.
As you may have a lot of runs to do, you should probably write a shell
or Perl script to perform the searches (and possibly to filter the
results).
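As a rough sketch of what such a driver script might look like (written
here in Python rather than shell or Perl): it assumes each sequence is in
its own file ending in ".seq", that all sequences have also been combined
into one library file, and that the search tool is called as
"program query library" and writes its report to standard output. The
program name "search_program", the file pattern and that argument order
are placeholders of mine, not the real command line of FASTA or any other
package, so adjust them to whatever your tool expects.

```python
import subprocess
from pathlib import Path


def plan_runs(seq_dir, library, program="search_program", pattern="*.seq"):
    """Return a (query_path, command) pair for every query file.

    'search_program' and the 'program query library' argument order are
    placeholders; substitute whatever your search tool expects.
    """
    queries = sorted(Path(seq_dir).glob(pattern))
    return [(q, [program, str(q), str(library)]) for q in queries]


def run_all(seq_dir, library, out_dir, program="search_program", pattern="*.seq"):
    """Run every planned search and save each report as <query name>.out."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for query, cmd in plan_runs(seq_dir, library, program, pattern):
        report = out_dir / (query.stem + ".out")
        with open(report, "w") as handle:
            # Assumes the tool writes its report to standard output.
            subprocess.run(cmd, stdout=handle, check=True)
```

Calling plan_runs() on its own lets you check the list of commands before
anything is run. The reports collected in the output directory can then be
filtered by a second script.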
Keith
--
Keith James -- kdj at sanger.ac.uk -- http://www.sanger.ac.uk/Users/kdj
The Sanger Centre, Wellcome Trust Genome Campus, Hinxton, Cambs CB10 1SA