Let A = score, m = estimate of the mean of the randomized
scores, s = estimate of the standard deviation of the
randomized scores, and t = Student's t (see below).
If A lies outside the range m (+ or -) ts (for a similarity
score, A > m + ts) then it is significant. t will in general
depend on the confidence limits (how sure you want to be
that the score is not random) and on the number of
randomization samples that you took. For example, for a
large number of random samples t = 1.96 for 95% confidence
limits. Two
good books are "Analytical Chemistry" by Skoog and West
and "Statistics for Chemists" by Youmans (I think). The
above treatment only holds for a completely random model
of biochemical sequences - something that is not really
valid. More detailed methods of estimating statistical
validity appear in the work of Karlin and coworkers but I
don't understand their work. The most sensible easy
criterion I have found is that if identity > 25% and length
greater than 80%, the sequences are structurally related.
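
The significance test described earlier (compare A with
m + ts) can be sketched in Python roughly as follows. This is
only a minimal sketch under assumptions that are not in the
text above: the scoring function identity_score is a
hypothetical stand-in (it just counts matching positions),
the randomization is a plain shuffle of the second sequence,
and t defaults to the large-sample value of 1.96. For a small
number of shuffles you would look t up in a Student's t table
instead.

```python
import random
import statistics


def identity_score(a, b):
    """Hypothetical toy score: number of positions where a and b agree."""
    return sum(x == y for x, y in zip(a, b))


def randomized_scores(a, b, n_samples=1000, seed=0):
    """Score a against n_samples shuffled copies of b."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    letters = list(b)
    scores = []
    for _ in range(n_samples):
        rng.shuffle(letters)  # destroys order, keeps composition
        scores.append(identity_score(a, letters))
    return scores


def is_significant(A, scores, t=1.96):
    """Return (A > m + t*s, m, s) for the randomized scores.

    t=1.96 is the large-sample value for 95% confidence limits;
    this default is an assumption, not a general recommendation.
    """
    m = statistics.mean(scores)   # estimate of the mean
    s = statistics.stdev(scores)  # sample standard deviation
    return A > m + t * s, m, s


if __name__ == "__main__":
    seq_a = "ACDEFGHIKLMNPQRSTVWY" * 3
    seq_b = "ACDEFGHIKLMNPQRSTVWY" * 3
    A = identity_score(seq_a, seq_b)
    significant, m, s = is_significant(A, randomized_scores(seq_a, seq_b))
    print(A, round(m, 2), round(s, 2), significant)
```

Shuffling keeps the composition of the second sequence but
destroys its order, which is exactly the "completely random
model" that the caveat above applies to.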
I hope that this was of some help.