
Molecular Clocks and HIV Origin

Tom Keske tkeske at mediaone.net
Fri Jun 2 21:17:28 EST 2000


Los Alamos researcher Bette Korber recently used computer
calculations to estimate that HIV originated in approximately
1930, plus or minus 20 years.

Using 160 strains of the virus, Korber created what she calls a
"molecular clock" to see how the viral mutations developed over
time from a "common ancestor".

The "molecular clock" theory says that HIV mutates at a
predictable rate, so that degrees of genetic differences among
strains can produce an estimate of evolution time.
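As a rough illustration of the clock arithmetic (not Korber's actual
procedure, which this essay does not reproduce), suppose two strains
differ at a fraction d of their sites and each lineage changes at a
constant rate r per site per year.  Both lineages have been diverging
for the same time t, so d is roughly 2rt and t is roughly d / (2r).
The function name and the numbers below are hypothetical, chosen only
to show the calculation.

```python
def years_since_common_ancestor(distance, rate_per_site_per_year):
    """Strict-clock estimate: both lineages change at the same constant rate."""
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    # Each of the two lineages contributes rate * t, hence the factor of 2.
    return distance / (2.0 * rate_per_site_per_year)


# Hypothetical inputs, not taken from any study.
distance = 0.14   # fraction of sites that differ between two strains
rate = 0.001      # assumed substitutions per site per year, per lineage
print(years_since_common_ancestor(distance, rate))
```

Note that the estimate is inversely proportional to the assumed rate:
if the true rate were twice the assumed one, the estimated time would
be cut in half.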

The media tended to hail this study as a major breakthrough, but
did not adequately explain how contested its methodology is.
This essay describes several of the disputed assumptions behind
Korber's approach.


The theory of genetic evolution as used by Korber is a departure
from the theory of evolution as commonly understood in the Darwinian
sense of the word.  Korber's model assumes the "neutral mutation
hypothesis", which says that the rate of evolution is equal to the
rate of mutation.   In other words, it does not factor in "natural
selection" as a mechanism for determining which strains of HIV will
be viable.
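The claim that the rate of evolution equals the rate of mutation is
the standard textbook result of neutral theory.  A brief sketch,
assuming a haploid population of N genomes and a neutral mutation rate
of \mu per genome per generation (these symbols are introduced here
only for illustration):

```latex
% N\mu new neutral mutations arise per generation, and each one
% eventually becomes fixed with probability 1/N.
k \;=\; \underbrace{N\mu}_{\text{new mutations per generation}}
   \times
   \underbrace{\frac{1}{N}}_{\text{fixation probability}}
   \;=\; \mu
```

The population size cancels, so under strict neutrality the
substitution rate k depends only on the mutation rate.  If selection
acts, the fixation probability is no longer 1/N and the cancellation
does not hold.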

Korber's model does not try to analyze which new strains or
mutations in HIV would survive better than others.   It assumes
that all mutations are essentially equal in their likelihood to
survive.  In other words, there is no principle of "survival
of the fittest."

The "Neutral Theory" was developed by Kimura in the 1960s.  Kimura
claimed that most molecular evolution was the result of genetic
drift, not selection [2].

The "Neutral Theory" has not been without controversy [3] [4].

If Kimura is correct, then most changes at the molecular level are
neutral and the rate of neutral mutations is high.  If the
selectionists are correct, then most changes have a selective
advantage and neutral mutations are rare [2].


An example of the controversy concerning neutral evolution is the
major conflict between molecular biologists and paleontologists
concerning mammalian evolution.

Humans can trace the origins of many of their mammalian relatives
back either 65 million years or 130 million years, depending on
which group of scientists they choose to trust [7].

The fossil record suggests that placental mammals (including humans)
first evolved around 65 million years ago.  However, according to
a molecular clock analysis, the same group of mammals should
have evolved 130 million years ago.

As Chicago paleontologist Michael Foote puts it,

   "We find that the quality of the fossil record is something
   like 10 to 100 times greater than the quality that would be
   required by this hypothesis of missing species diversity,"
   said Foote, Associate Professor in Geophysical Sciences at
   Chicago. "It's such a large discrepancy, we ended up
   concluding that it's difficult if not impossible to
   maintain that there are 65 million years of fossils missing
   from the history of modern placental mammals. This result
   calls into question the use of a strict molecular clock to
   date the origins of major biologic groups."


There is not only controversy surrounding "neutral evolution" and
molecular clocks in general, but there is also reason to question
whether neutral evolution would apply specifically to HIV, in any
case.

A 1997 study concerning HIV evolution, performed at the
Aaron Diamond AIDS Research Center, concluded that selection
did in fact play a significant role [8]:

   "This strongly argues for the dominant role that positive
    selection for amino acid change plays in governing the
    pattern and process of HIV-1 env V3 evolution in vivo and
    nullifies hypotheses of purely neutral or mutation-driven


Korber's model does not assume that "molecular clocks" are
completely regular over short-term intervals, but it does
assume that they are relatively constant and predictable over
long periods of time.  It assumes that the pace of the molecular
clock will not change significantly, in unpredictable ways,
as new strains develop.

This assumption remains controversial.

As described in "The Molecular Clock Problem", "molecular clocks
sometimes behave in an erratic manner which calls into question
their use" [4].

A similar conclusion was reached in a study by researchers at
the University of California: "The great variation in the rates
of the molecular clock raises questions concerning whether it
can be used to infer evolutionary time from contemporary sequence
differences." [5]

Korber herself has acknowledged limitations in the use of molecular
clocks as applied to considerations of HIV-1 origin [6].


Another problem with "molecular clocks" is that they cannot
easily account for recombination events, which occur less
predictably than point mutations.   A "recombination event" occurs
when two related strains combine their genetic material and
produce a hybrid.

This can represent a "fast forward" evolution, in comparison
to the length of time that it would have taken either strain,
by itself, to mutate into an equivalent of the new hybrid,
simply under the power of their "molecular clocks".
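The "fast forward" effect can be sketched with a toy example.  The
sequences, the crossover point, and the clock rate below are all
hypothetical, and real genomes and recombination are far more complex.
The sketch only shows that a single recombination event can introduce
as many differences as point mutation alone would need years to
accumulate.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def recombine(a, b, cut):
    """Hybrid made of the left part of `a` and the right part of `b`."""
    return a[:cut] + b[cut:]


# Hypothetical toy sequences, far shorter than any real genome.
strain_a = "AAAAAAAAAAAAAAAAAAAA"
strain_b = "AAGAATAACAAGAATAACAA"
hybrid = recombine(strain_a, strain_b, 10)

# Differences between strain_a and the hybrid, gained in one event.
changes = hamming(strain_a, hybrid)

# Hypothetical clock rate for this toy: changes per year by mutation alone.
assumed_changes_per_year = 0.5
print(changes, changes / assumed_changes_per_year)
```

A clock that interprets every difference as an accumulated mutation
would read the hybrid as being years removed from strain_a, even
though the hybrid arose in a single step.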

Studies have indicated that recombination is an important
factor in the evolution of HIV.  For example, as described
by a CDC study [9]:

   "Recombination may be an important fitness search strategy
    in the ongoing evolution of HIV. Many of the strains around
    the world appear to have arisen through recombination."


It is conceivable for a virus to have a rate of evolution that
is independent of the number of hosts that are infected.  It has
been proposed that HIV might be in this category, and Korber's
model would be in large measure dependent on this assumption.

However, this is yet another assumption that is subject to question,
and would require demonstration.

HIV is phenomenal in its capacity for mutation.  On average,
each individual reproductive cycle that produces a new virus
has approximately one mutation [9].

Within a single infected individual, genetic variation caused
by mutations can reach 20% of the entire HIV genome [10].

In view of the extreme amount of per-host genetic variation
that occurs, it would seem to make intuitive sense that the
rate of evolution of new HIV strains would be sensitive to the
number of people who are infected.  Because mutations are random,
and many different mutations are possible, it would be very
unlikely for any two individuals to be giving rise to the same
new viral strains, simultaneously.

It would seem reasonable to expect that many more new viral
strains would emerge in a given period of time, for example,
if 20 million total people are infected, rather than only
20 people.
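This intuition can be put in rough numbers with a toy model.  It
assumes that each infected host contributes a fixed number of new
variants, each drawn uniformly at random from a large pool of possible
variants.  The pool size and the per-host count are hypothetical, and
the model counts variants that arise, not variants that survive and
spread.

```python
import math


def expected_distinct_variants(hosts, variants_per_host, possible_variants):
    """Expected number of different variants among uniform random draws."""
    draws = hosts * variants_per_host
    # P(a given variant is never drawn) = (1 - 1/G) ** draws.
    # log1p and expm1 keep the result accurate when G is very large.
    log_miss = draws * math.log1p(-1.0 / possible_variants)
    return possible_variants * -math.expm1(log_miss)


possible = 10**9  # hypothetical number of possible variants
for hosts in (20, 20_000_000):
    print(hosts, round(expected_distinct_variants(hosts, 1, possible)))
```

When the pool of possible variants is much larger than the number of
draws, duplicates are rare and the count of distinct variants grows
almost in proportion to the number of hosts.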

Korber's model does not claim to predict with accuracy how many
human beings might have been infected around the world,
in any given year, prior to the known outbreak of the epidemic.
Such estimates have been the subject of much controversy, and
vary widely in range.


Not all quoted statistical confidence intervals are equally
meaningful.   A confidence interval exists in the context of
certain stated assumptions.   The validity of the confidence
interval depends on the validity of the assumptions.

When you are computing something simple, like the chance of drawing
10 white marbles from a bag of mixed-color marbles,
the assumptions are few and simple: that you know how many
marbles of each color, that the marbles are evenly mixed, that
they are drawn randomly, without peeking.
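For the marble case the whole calculation fits in a few lines.  A
minimal sketch, assuming draws without replacement from a well-mixed
bag (the bag contents below are hypothetical):

```python
from math import comb


def prob_all_white(white, other, draws):
    """Chance that every marble drawn, without replacement, is white."""
    total = white + other
    if draws > total:
        raise ValueError("cannot draw more marbles than the bag holds")
    # Ways to pick `draws` white marbles over ways to pick any `draws` marbles.
    return comb(white, draws) / comb(total, draws)


# Hypothetical bag: 30 white marbles and 20 of other colors.
print(prob_all_white(30, 20, 10))
```

Every input here can be checked by counting marbles, which is what
makes a probability statement about the bag easy to trust.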

For Korber's study, the assumptions are more numerous and
far more controversial: that the theory of "neutral
evolution" is correct, that molecular clocks are
predictable, that viral evolution rate is independent of
the number of infected hosts, that recombination events are
not significantly "fast-forwarding" the evolution.

Thus the claim of "95% certainty" for even a 20-year margin
of error is more suspect than the figure itself suggests.

A software package called PHYLIP (Phylogeny Inference Package),
which performs computations similar to those in Korber's study,
contains the following caveat [11]:

    important thing to keep in mind while running any of the
    parsimony or compatibility programs is not to overinterpret
    the result. Many users treat the set of most parsimonious
    trees as if it were a confidence interval.  If a group
    appears in all of the most parsimonious trees then they
    treat it as well established. Unfortunately THE CONFIDENCE
    SET OF ALL MOST PARSIMONIOUS TREES (Felsenstein, 1985b).
    Likewise, variation of result among different methods will
    not be a good indicator of the size of the confidence


Clearly, there is more room to question and criticize Korber's
study, and a 1930s origin for HIV, than the media have made clear.

Ironically, Korber's study highlights how cavalier the media
have been in earlier claims, such as this one by ABC News:

   "People in West-Central Africa have probably been dying of
    AIDS for thousands of years, and most likely they contracted
    the HIV-1 virus  by hunting chimpanzees for meat." [12]

If Korber's study is anywhere near correct, then the above media
claim would be grossly off the mark.

This seems to be part of a familiar pattern with HIV-related data in
general.  The date of origin has been suggested anywhere from the
1970s, to hundreds of years ago, to thousands of years ago.
Incubation times have been said to vary by a factor of 50 or more.
Infectivity rates have been said to vary by factors of up to 5000
or more.

The recent experience at the March on Washington might help to give
perspective on one possibility for how this confusion and
contradiction might arise.  In estimating the size of the crowd,
organizers cited 750,000 while conservative sources claimed only

One would think that this, too, would be a purely "scientific",
objective matter.   There are techniques for crowd estimation that
should be more accurate and less dubious than the methods used at
Los Alamos labs for estimating HIV origin.

You know very well what the real issue is: not a "scientific
disagreement", but a matter of propaganda, based on political
agendas, creating a confusion that is not resolved by the media
(perhaps as likely *caused* in part by the media).

It is similar to the confusion over the death toll of the My Lai
massacre, where different sources list different numbers for how
many were killed, varying by as much as 100 percent.  If you want to
minimize an embarrassing scandal, you knock the number down.  If
you are at the other end of the political spectrum, and want to
expose a scandal, you pump the number up.

Analyzing motives can be as important as analyzing the data.
Judging the timing of a media event is as important as judging the
timing of a molecular clock.

There was an agenda to the Los Alamos study: to downplay the
successful book by Edward Hooper, "The River", which documented the
likely link between polio vaccines and the outbreak of AIDS in

It is likely to be as much for the sake of a politicized agenda, as
for reasons of mere scientific naivete, that the media put such
total credence in a study where the controversy should have been

Tom Keske
Boston, Mass


[1] http://gday.educ.mun.ca/aaron/4241/Clock2.htm

[2] "The Neutral Theory"

[3] "Nearly neutral evolution: fact or fantasy?"
    (Washington State University)

[4] "The Molecular Clock Problem"

[5] J Mol Evol 1998 May;46(5):552-561
    "The Molecular Clock Runs at Different Rates Among Closely Related
    Members of a Gene Family"
    Gibbs PEM, Witke WF, Dugaiczyk A
    Department of Biochemistry, University of California, Riverside, CA
    92521, USA
    PMID: 9545466

[6] "Limitations of a molecular clock applied to considerations of
    the origin of HIV-1"
    Korber B, Theiler J, Wolinsky S
    PMID: 9669945, UI: 98329785

[7] http://www-news.uchicago.edu/releases/99/990224.mammals.shtml
    "Scientists devise method to address conflict between molecular
    clock, fossil record of mammalian evolution"

[8] J Virol 1997 Mar;71(3):2555-61
    "Host-specific driving force in human immunodeficiency
    virus type 1 evolution in vivo"
    Zhang L, Diaz RS, Ho DD, Mosley JW, Busch MP, Mayer A
    PMID: 9032400, UI: 97184598

[9] "Recombination in HIV: An Important Viral Evolutionary Strategy"

[10] "The fastest genome evolution ever described: HIV variation
     in situ."

[11] "PHYLIP (Phylogeny Inference Package) Version 3.57c"

[12] http://abcnews.go.com/sections/living/DailyNews/aids990131.html
     "The Beginning of AIDS", by Bill Brewster, ABCNEWS.com,
     Jan. 31, 1999
