Evaluating low identity scores

James McInerney jamm at nhm.ac.uk
Mon Jan 12 12:47:19 EST 1998


What you propose is very reasonable, and it has already been implemented
in the FASTA package of programs.  Several programs in that package
perform randomisation tests of one kind or another.  You can get the
package from:


        ftp://ftp.virginia.edu/pub/fasta/fasta20u41.shar.Z      (unix)
        ftp://ftp.virginia.edu/pub/fasta/mac/fasta20u4.cpe.bin  (mac)
        ftp://ftp.virginia.edu/pub/fasta/dos/fa20u416.zip       (dos 16-bit)
        ftp://ftp.virginia.edu/pub/fasta/dos/fa20u432.zip       (win95 32-bit)
        ftp://ftp.virginia.edu/pub/fasta/dos/fa20u4sr.zip       (sources)
        ftp://ftp.virginia.edu/pub/fasta/dos/fa2u4doc.zip       (docs)
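
To make the idea concrete, here is a rough sketch in Python of the kind
of shuffling test described in the quoted message below.  It is not the
code from the FASTA package.  The function names, the toy scoring scheme
(match +1, mismatch -1, linear gap penalty -2) and the default of 200
shuffles are all assumptions chosen for illustration.  A real test would
use a proper substitution matrix and separate gap creation and gap
length weights.

```python
import random

def shuffle_sequence(seq, rng):
    """Return a random permutation of seq (hypothetical helper).
    The amino acid composition is preserved exactly."""
    residues = list(seq)
    rng.shuffle(residues)
    return "".join(residues)

def align_score(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score by dynamic programming (Needleman-Wunsch).
    Toy scoring: identity match/mismatch and a linear gap penalty."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]

def shuffle_test(a, b, n_shuffles=200, seed=0):
    """Compare the observed score with scores against shuffled copies of b.
    Returns (observed score, z-score or None, empirical p-value)."""
    rng = random.Random(seed)
    observed = align_score(a, b)
    null = [align_score(a, shuffle_sequence(b, rng)) for _ in range(n_shuffles)]
    mean = sum(null) / n_shuffles
    var = sum((s - mean) ** 2 for s in null) / (n_shuffles - 1)
    sd = var ** 0.5
    z = (observed - mean) / sd if sd > 0 else None
    # Fraction of shuffles scoring at least as well as the real pair,
    # with +1 in numerator and denominator so p is never exactly zero.
    p = (1 + sum(s >= observed for s in null)) / (n_shuffles + 1)
    return observed, z, p
```

Calling shuffle_test(seq1, seq2) on the two protein sequences gives the
observed score, its distance from the shuffled mean in standard
deviations, and an empirical p-value.  If the shuffled scores are about
as high as the observed one, the similarity is probably explained by
composition alone.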

Hope this helps.


thorsten burmester wrote:
> Dear all,
> I would like to have your comments on the following idea:
> One often reads in the literature speculations about possible
> relationships of proteins with only some 15 to 20% identity scores.
> Recently, I thought that a possible method to evaluate the
> significance of such low similarity scores would be to randomise the
> sequences of these proteins by keeping the relative amino acid
> composition. If one does this several times (with one or both of the
> sequences), and re-align these randomised sequences with the same gap
> creation and gap length weights, in case this original alignment was
> significant, the new similarity/identity scores should be
> significantly lower. However, if the observed identity is just due to
> similar amino acid compositions, the scores should be similar.
> My questions:
> 1. Does this sound reasonable, and has anybody ever tried a similar
> approach before?
> 2. Do you know any program that can randomise an amino acid sequence
> as described above?
> Thanks for your help.
> Thorsten
> --
> Thorsten Burmester - thorsten at erfurt.thur.de

