Program for testing incongruences

Guy Hoelzer hoelzer at unr.edu
Fri Apr 26 11:59:25 EST 1996

In article <4lp0pj$rrp at lastactionhero.rs.itd.umich.edu>, tdib at umich.edu
(Thomas K. Dibenedetto) wrote:

> Tosak Seelanan (tosak at iastate.edu) wrote:
> : I have data sets from the same set of taxa, and these data sets gave
> : different phylogenies.  I would like to get a phylogeny based on
> : the combined data set, but I have to show that the incongruence
> : between data sets is not significant.
> Sorry I can't help with Farris' email address, but I was just wondering
> what "significance" could possibly mean in this context... I mean, either 
> the two phylogenies are the same or they are not. Why would you need to 
> prove anything in order to combine the two data-sets?

Isn't this always the case when one applies statistical analyses to test
for differences between two data sets?  I am not just talking about
phylogenetic statistics.  The means or variances of any two data sets can
differ due to sampling error or because they actually represent different
populations (i.e., different phylogenies).  Phylogenetic data are no
different.  They, too, are samples of taxa, of the extant variation within
those taxa, of characters, etc.  Furthermore, the extant taxa and variation
are a sample of historic taxa and variation.  Therefore, when you have two
data sets, even when they contain samples from identical sets of taxa, they
may differ either due to sampling error or because they are actually
samples of different phylogenies.  For example, sequences from two different
genes in the same set of individuals may have different phylogenetic
histories (see Pamilo & Nei 1988).  It is thus important to know whether the
data sets really contain conflicting phylogenetic signals prior to
combining the data.
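
To make the sampling-error point concrete, here is a minimal sketch of a
permutation test on the difference in means of two ordinary samples.  It is
only a generic illustration of the logic, not the phylogenetic incongruence
test asked about above; the function name permutation_p_value, its
parameters, and the toy numbers are all made up for this example.

```python
import random


def permutation_p_value(a, b, n_perm=2000, seed=0):
    """Estimate how often random relabeling of the pooled values gives a
    difference in means at least as large as the observed one.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    n_a = len(a)
    observed = abs(sum(a) / n_a - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Shuffle the pooled values and split them into two pseudo-samples
        # of the original sizes, as if the labels carried no information.
        rng.shuffle(pooled)
        perm_a = pooled[:n_a]
        perm_b = pooled[n_a:]
        diff = abs(sum(perm_a) / len(perm_a) - sum(perm_b) / len(perm_b))
        # Small tolerance so floating-point ties count as "at least as large".
        if diff >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Toy data (invented): two samples that overlap, and two that clearly do not.
similar = permutation_p_value([4.1, 5.0, 4.7, 5.3], [4.8, 4.4, 5.1, 4.9])
distinct = permutation_p_value([1, 2, 3, 4, 5, 6, 7, 8],
                               [101, 102, 103, 104, 105, 106, 107, 108])
print(similar, distinct)
```

A large value says the observed difference is the kind of thing sampling
error alone produces; a small value suggests the two samples really come
from different populations.  The question for phylogenetic data sets has
the same form, with a measure of tree conflict in place of the difference
in means.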

It has been argued that combining data sets with conflicting phylogenies,
caused by the use of different characters, is still a useful way to get at
the phylogenetic relationships of the taxa as a whole, rather than just of the
set of characters in a particular data set.  The basic idea is that the
areas of conflict will become noise and the remaining phylogenetic signal
will better represent the organismal relationships.  Others argue that it
is inappropriate to combine statistically different data sets.  In that
case, one should examine the set of distinct trees available for the taxa
under study.  The differences among them might be informative and the
similarities are likely to indicate real patterns in the history of the
taxa as a whole.  BTW, subsets of what is collected as a single data set can
also contain significantly distinct phylogenetic signals; so, the question
of combining data sets is identical to the question of searching for
conflicting signals within any one data set.  I believe that this is a
valuable debate that will continue for some time.

To state my answer to the question posed above (Why would you need to prove
anything in order to combine the two data-sets?) more directly: you don't
need to prove anything, but if you combine without testing you might miss
out on interesting and important information.

Guy Hoelzer                                               
hoelzer at med.unr.edu
Dept. of Biology
University of Nevada Reno
Reno, NV  89557
