IUBio

Judging seq. similarity??E-mail addr. included.

Brian Foley btf at t10.lanl.gov
Fri Nov 22 19:02:47 EST 1996


campanelli lab wrote:
> 
> I'm pretty statistically ignorant. What are some good rules to use in
> comparing two aligned sequences' similarities or percent identities with
> a randomized version of one? For example, two sequences have 34%
> sequence identity in a pileup. After randomizing one of the sequences
> this falls to 15%. What is a good way to judge the significance of this?
> Any references would be appreciated. Thanks.
> 
> Steve Johnson
> Biochemistry
> Univ. of Illinois
> sljohnsn at staff.uiuc.edu

	Biology does not always pay attention to statistics.
There are some genes with little similarity that have the
exact same function.  There are other genes that are nearly
identical and have opposing functions (one DNA-binding protein
may be a transcriptional activator and the other a 
transcriptional repressor).
	The simple measure of similarity or sequence identity
is a good start, but we would also like to know:
Is the similarity evenly distributed throughout the
genes, or are there conserved domains, separated by 
variable regions?
Are these two genes from the same species, or are
you comparing a human gene to an E. coli gene?
Are the first two positions of the codons more
likely to be conserved than the last?  (What
is the synonymous/nonsynonymous substitution ratio?
This is not very useful when sequence identity is 
less than 60%).
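
	The first and third questions above can be checked with a few
lines of code.  What follows is only a minimal sketch in Python, not a
GCG program: the function names, the window size of 30 and the step of
10 are illustrative assumptions.  It assumes the two sequences are
already aligned to the same length, with "-" marking gaps, and the
codon-position function further assumes in-frame coding sequences
with no gaps.

```python
def percent_identity(a, b):
    """Percent identity over aligned columns; columns with a gap in either sequence are skipped."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def windowed_identity(a, b, window=30, step=10):
    """Percent identity in sliding windows, to show whether similarity is even or clustered."""
    last_start = max(len(a) - window, 0)
    return [
        (start, percent_identity(a[start:start + window], b[start:start + window]))
        for start in range(0, last_start + 1, step)
    ]


def codon_position_identity(a, b):
    """Percent identity at codon positions 1, 2 and 3 (assumes in-frame, gap-free sequences)."""
    return [percent_identity(a[i::3], b[i::3]) for i in range(3)]
```

Windows with high identity separated by windows near the background
level suggest conserved domains.  A third codon position that is much
less conserved than the first two is what one would expect from
mostly synonymous change.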

	There are a number of ways to get statistically
significant similarity.  Two genes could convergently evolve
toward a similar sequence.  A single gene can duplicate and
diverge within a single species.  A single gene can diverge between
two different species.  Parts of one gene can recombine with 
another.
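
	Whether an observed identity is statistically significant in
the first place can be estimated with the shuffle test described in
the original question, repeated many times rather than once.  The
sketch below is an assumption-laden illustration in Python, not a GCG
procedure: the function names, the default of 1000 shuffles and the
fixed seed are all hypothetical choices.  It also takes a shortcut by
scoring each shuffled sequence against the existing alignment columns.
Realigning each shuffled sequence before scoring would raise the
background identity and is the stricter test.

```python
import random


def percent_identity(a, b):
    """Percent identity over aligned columns; columns with a gap in either sequence are skipped."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def shuffle_test(a, b, n_shuffles=1000, seed=0):
    """Compare the observed identity with identities obtained after shuffling b.

    Returns (observed, mean of shuffled, z-score, empirical p-value).
    Shuffling keeps the composition of b but destroys its order.
    No realignment is done after each shuffle (a simplifying assumption).
    """
    rng = random.Random(seed)
    observed = percent_identity(a, b)
    residues = list(b)
    shuffled_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(residues)
        shuffled_scores.append(percent_identity(a, "".join(residues)))
    mean = sum(shuffled_scores) / len(shuffled_scores)
    variance = sum((s - mean) ** 2 for s in shuffled_scores) / (len(shuffled_scores) - 1)
    sd = variance ** 0.5
    z = (observed - mean) / sd if sd > 0 else float("nan")
    # Add-one correction so the empirical p-value is never exactly zero.
    p = (1 + sum(s >= observed for s in shuffled_scores)) / (n_shuffles + 1)
    return observed, mean, z, p
```

A call such as shuffle_test(seq1, seq2) gives the observed identity,
the average identity after shuffling, how many standard deviations
separate them, and the fraction of shuffles that did at least as well
as the real sequence.  As noted above, a significant result says
nothing by itself about shared function.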

-- 
 ____________________________________________________________________
|Brian T. Foley                btf at t10.lanl.gov                      |
|HIV Database                  (505) 665-1970                        |
|Los Alamos National Lab       http://hiv-web.lanl.gov/index.html    |
|Los Alamos, NM 87544  U.S.A.  http://hiv-web.lanl.gov/~btf/home.html|
|____________________________________________________________________|



More information about the Info-gcg mailing list