Dumber DNA Questions

Stodolsky, Marvin Marvin.Stodolsky at science.doe.gov
Wed Oct 24 10:03:53 EST 2001

The English alphabet uses some 26 letters.
Computers get along with just two symbols at the CPU level: 0 and 1.
The biological code gets along with four: A, T, G, C.
But each of these systems manages considerable diversity with strings of
their symbols.
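
As a rough illustration of that point, the number of distinct strings grows
very quickly with length, whatever the alphabet size. The short Python sketch
below is only an illustration; the function names count_strings and
symbols_needed are made up for this example.

```python
def count_strings(alphabet_size: int, length: int) -> int:
    """Number of distinct strings of the given length over an alphabet."""
    return alphabet_size ** length


def symbols_needed(alphabet_size: int, target: int) -> int:
    """Smallest string length that can spell at least `target` distinct strings."""
    length = 0
    total = 1
    while total < target:
        total *= alphabet_size
        length += 1
    return length


if __name__ == "__main__":
    # Compare binary (2), DNA (4) and English (26) alphabets at length 10.
    for name, size in [("binary", 2), ("DNA", 4), ("English", 26)]:
        print(name, count_strings(size, 10))

    # How many binary digits or DNA bases are needed to label 26 letters?
    print(symbols_needed(2, 26), symbols_needed(4, 26))
```

A smaller alphabet just needs longer strings to say the same thing: five binary
digits or three DNA bases are already enough to give each of the 26 English
letters its own label.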

All living organisms (plants, microbes and animals) use the A, T, G, C
but can be considerably different from one another.  This just illustrates
that 4 symbols are indeed enough to support great diversity in encoded
information.

Among humans, DNA differences are on the order of 0.1%, most of which are NOT
in protein coding regions, but in spacer/junk DNA regions and without
effect.  That leaves a small residuum of changes which IN PART reflect the
exciting diversity of humanity.
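
To put a number on that percentage, here is a back-of-the-envelope Python
sketch. It assumes the rough genome length of 3.1 billion base pairs quoted in
the original question and takes 0.1% as an order-of-magnitude figure only; the
names used are hypothetical.

```python
# Rough figures only: both values are approximations, not measurements.
GENOME_LENGTH = 3_100_000_000   # base pairs, the rough average from the question
DIFFERENCE_FRACTION = 0.001     # 0.1%, an order-of-magnitude estimate


def expected_differences(length: int, fraction: float) -> int:
    """Approximate number of positions that differ between two genomes."""
    return round(length * fraction)


if __name__ == "__main__":
    print(expected_differences(GENOME_LENGTH, DIFFERENCE_FRACTION))
```

So 0.1% sounds tiny, but on a genome of this size it still amounts to roughly
three million differing positions between two people.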

Even identical twins are not identical in ALL of their cellular DNA.
The immune system matures and diversifies in part by mutations/changes
restricted to certain regions of the immunoglobulin genes.
That is why twins can have different allergies, for example.

The figure of 3.1 billion base pairs is a rough average.
Among normal individuals, there can be completely harmless differences in
length in spacer/junk DNAs and also in the regions of repeated sequence
blocks near the centers (centromeres) and ends (telomeres) of the chromosomes.

See http://www.ornl.gov/hgmis for much useful information.

Marvin Stodolsky
DOE Human Genome Program

-----Original Message-----
From: DJ [mailto:troglodytius-maximus at blackhole.com]
Sent: Saturday, February 17, 2001 1:21 AM
To: autoseq at net.bio.net
Subject: Dumber DNA Questions

Could someone please explain to a novice in this field something which I
don't understand.

I have recently read that...

"the book of the human genome will consist of thousands and thousands of
pages of base pair sequences, such as AATCGGATTCCC...".

This seems to imply that everybody has exactly the same DNA.  Given that we
are all different (except identical twins), even if only slightly, surely
then no such book could have all of these base pair sequences "hard-coded".
So how will this book address that problem?

Secondly I would like to know if everybody (or at least most people) has
exactly the same number of base pairs, which I understand is in the order of
3.1 billion.  If not, how would the book account for any variation?



