
New FASTA versions available

William R. Pearson wrp at dayhoff.med.Virginia.EDU
Sat Feb 12 11:55:29 EST 1994


	A new release of the FASTA program package, version 1.7, is
now available from virginia.EDU in pub/fasta as fasta17.shar(.Z).

	This version replaces the "rdf2" and "rss" programs with
"prdf" and "prss", which calculate more accurate estimates for the
statistical significance of a similarity score based on the scores of
randomly shuffled sequences.  The earlier "rdf2" and "rss" programs
calculated a "z-value," which is not very informative if the
distribution of similarity scores is not normal.  Sequence similarity
scores for random sequences are distributed according to the extreme
value distribution, which is quite different from the normal
distribution, especially for high scores.  Prss and prdf estimate the
parameters of the extreme value distribution and use these parameters
to calculate the probability that a score as good as or better than
the unshuffled sequence score would be obtained by chance.  I
appreciate the help of Stephen Altschul, who showed me the error of
calculating "z-values", and the help of Phil Green, who provided the
extreme value distribution estimation routine.
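
	As a rough illustration of the idea (not the code in prss or
prdf, and not Phil Green's estimation routine), the sketch below fits
an extreme value (Gumbel) distribution to a set of shuffled-sequence
scores by the method of moments and then computes the probability of a
score at least as high as the unshuffled one.  The function names, the
synthetic scores, and the choice of the method of moments are all
assumptions made for this example.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(scores):
    """Estimate Gumbel location and scale from the sample mean and sd.

    Method of moments (an assumption for this sketch):
    sd = pi * scale / sqrt(6), mean = location + gamma * scale.
    """
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)
    scale = sd * math.sqrt(6.0) / math.pi
    location = mean - EULER_GAMMA * scale
    return location, scale


def tail_probability(score, location, scale):
    """P(S >= score) under the fitted Gumbel: 1 - exp(-exp(-(s - loc)/scale))."""
    return -math.expm1(-math.exp(-(score - location) / scale))


def synthetic_shuffled_score(location, scale):
    """Stand-in for one shuffled-sequence score (hypothetical data source).

    Draws from a Gumbel by inverse transform; a real run would score the
    query against a shuffled copy of the second sequence instead.
    """
    u = max(random.random(), 1e-12)  # keep u strictly inside (0, 1)
    return location - scale * math.log(-math.log(u))


random.seed(0)
shuffled = [synthetic_shuffled_score(30.0, 5.0) for _ in range(1000)]
loc, scale = fit_gumbel_moments(shuffled)
unshuffled_score = 70.0  # hypothetical score of the real alignment
print("location %.2f  scale %.2f" % (loc, scale))
print("P(score >= %.0f) = %.3g" % (unshuffled_score,
                                   tail_probability(unshuffled_score, loc, scale)))
```

The moment estimates are easy to follow but are sensitive to outliers
in the shuffled scores; other estimation methods, such as maximum
likelihood, may behave better.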

	Statistical estimates based on the extreme value distribution
are usually more conservative than earlier estimates based on
"z-values."
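
	To make the difference concrete, the sketch below takes a
score that lies z standard deviations above the mean of the shuffled
scores and compares the upper-tail probability under a normal
distribution with that under an extreme value (Gumbel) distribution
having the same mean and standard deviation.  This is only an
illustration under that matching assumption; the function names are
made up for the example and it is not the calculation done by rdf2,
rss, prdf or prss.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

# Gumbel parameters chosen so the distribution has mean 0 and sd 1.
UNIT_SCALE = math.sqrt(6.0) / math.pi
UNIT_LOCATION = -EULER_GAMMA * UNIT_SCALE


def normal_tail(z):
    """P(Z >= z) for a standard normal variable."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def gumbel_tail(z):
    """P(S >= z) for a Gumbel variable standardized to mean 0, sd 1."""
    return -math.expm1(-math.exp(-(z - UNIT_LOCATION) / UNIT_SCALE))


for z in (2.0, 3.0, 5.0, 8.0):
    print("z = %.0f   normal %.3g   extreme value %.3g"
          % (z, normal_tail(z), gumbel_tail(z)))
```

For the high scores of interest the extreme value tail is the larger
of the two, so a given score looks less significant than its z-value
alone would suggest.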

	In addition, a bug in the alignment routines that caused error
messages and core dumps on some machines has been fixed.  This bug has
also been fixed in version 1.6. The new 1.6 version is available as
fasta16c32.shar(.Z).

Bill Pearson



