
Pattern Sequences

Shi Fei shi at inf.ethz.ch
Fri May 7 07:22:08 EST 1993



Dear colleagues,

We have designed and implemented a new algorithm that searches a sequence
database (such as the EMBL Data Library, GenBank or Swiss-Prot) for all
sequences containing a block (a subsequence) that is very similar to a given
pattern sequence. Here "very similar" means that the difference between the
block and the pattern sequence, under some distance metric such as the edit
distance, is no more than a given constant.
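
To make the search problem concrete, here is a minimal sketch of it using the
textbook dynamic-programming approach to approximate matching (often
attributed to Sellers). It is not the new algorithm announced in this message,
whose details are not given here; it only illustrates what "a block within
edit distance k of the pattern" means. The function names, the toy database
and the threshold k are all hypothetical and chosen for illustration.

```python
def min_block_distance(pattern, text):
    """Smallest edit distance between `pattern` and any block (substring) of `text`.

    Illustrative baseline only, not the algorithm described in this message.
    """
    m = len(pattern)
    # prev[i] = cost of aligning pattern[:i] with the best block ending at the
    # current text position. Before any text is read, only the empty block exists.
    prev = list(range(m + 1))
    best = prev[m]
    for ch in text:
        # cur[0] stays 0 because a block may start at any position in the text.
        cur = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text block
                cur[i - 1] + 1,      # pattern character missing from the block
            )
        best = min(best, cur[m])
        prev = cur
    return best


def search_database(pattern, database, k):
    """Names of all sequences containing a block within edit distance k of `pattern`.

    `database` is assumed to be a dict mapping a sequence name to its sequence.
    """
    return [name for name, seq in database.items()
            if min_block_distance(pattern, seq) <= k]


# Toy data, invented for this example.
toy_database = {
    "seq1": "TTTACGTTT",
    "seq2": "TTTACCTTT",
    "seq3": "GGGGGGGG",
}
hits = search_database("ACGT", toy_database, k=1)
```

This baseline takes time proportional to the pattern length times the total
database length, which is the cost a more efficient algorithm would aim to
improve on.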

In theory, the algorithm is expected to be very efficient, but we would like to
know whether it is also efficient and useful in biological practice. For this
purpose we need a number of real (DNA, RNA or protein) sequences to use as
pattern sequences, so that the database searches are meaningful. Presumably the
pattern sequences used to search a nucleotide sequence database should be
different from those used to search a protein sequence database. Could you
please send us, or point us to, some such pattern sequences (as separate sets
for nucleotide databases and for protein databases)? We are pure computer
scientists and need your help, particularly the help of biologists and
chemists!

Thank you very much in advance!

--------------------------------------------------------
Shi Fei

Institut fuer Theoretische Informatik
ETH-Zentrum
Ch-8092 Zurich
Switzerland

e-mail: shi at inf.ethz.ch
Fax: 0041-1-262-3973
Phone: 0041-1-254-7403
--------------------------------------------------------------



More information about the Mol-evol mailing list
