phylip package question

Chris Hoffmann choffman at lucas.cis.temple.edu
Sat Oct 19 13:40:22 EST 2002

Thanks guys, I appreciate your replies.
I also talked it over with some people around here, and nobody really had any
idea how to get around the problem. Since I still wanted to run DNADIST and
see the results, here is what I did:
I saw that DNADIST writes the distance matrices for the data sets it has
already analyzed (that is, those without bootstrapped sequences that yield
infinite distances) to the output file, up to the point where the program
hits a data set with infinitely distant sequences.
Then I used other bootstrap sets to run DNADIST and appended the results to
the previous file. In the end I had about the number of replicates I wanted
(I haven't had time yet to see whether the resulting consensus tree makes
sense).
It was a lot of work, but I couldn't think of any other way around it.
So here is my question:
Is that OK? Would this interfere with my bootstrapping analysis, since I'm
effectively discarding some of the bootstrap data sets?
Thanks again,

joe at removethispart.gs.washington.edu wrote:

> In article <and3qk$pm7$1 at mercury.hgmp.mrc.ac.uk>,
> Chris Hoffman  <choffman at lucas.cis.temple.edu> wrote:
> >I have a question regarding the DNADIST program from Phylip Package.
> >I run SEQBOOT with my seqs and get my new data sets produced using
> >bootstrap and so far so good. but when i use these new data sets to run
> >DNADIST, the program can't run it because it finds one or more sequences
> >that are supposedly too different to allow the computation to proceed.
> >I tried all the methods available in the program and all give similar
> >results.
> >btw: I haven't found any similar msgs running DNAPARS or DNAML
>
> This occurs for a reason inherent to bootstrapping.  When you have
> sequences, some of which are fairly distant, and bootstrap, two
> sequences can become so far apart that their distance would be infinite.
> For example, when you use a Jukes-Cantor distance, any two sequences
> that are more than 75% different will have an infinite distance.  Thus
> when your original sequences are (say) 70% different, bootstrapping
> can occasionally make those sequences 76% different.
> What should the distance program do in such a case?  I chose in PHYLIP
> to make it complain and stop.  Other people's programs sometimes are
> set to assign a large number (say 10) as the distance.  Both of these
> policies have disadvantages.  One denies you the ability to use that
> replicate, the other puts in somewhat fictional information.
> Parsimony and likelihood don't have this problem, though likelihood
> could put a species on the end of a very long branch.  In Dnaml, I
> just have the branch length get fairly long, then at some point the
> program has iterated it enough and leaves it at that length.
> ----
> Joe Felsenstein         joe at removethispart.gs.washington.edu
>  Department of Genome Sciences, University of Washington,
>  Box 357730, Seattle, WA 98195-7730 USA
> ---
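
The Jukes-Cantor point in the quoted reply can be sketched in a few lines of
Python. This is only an illustration, not what DNADIST does internally. It
uses the standard Jukes-Cantor formula d = -(3/4) ln(1 - 4p/3), where p is
the proportion of sites that differ, and a made-up pair of sequences with
100 sites of which 70 differ. The function names, the number of sites, and
the number of replicates are all assumptions chosen for the example.

```python
import math
import random


def jukes_cantor_distance(p):
    """Jukes-Cantor distance for a proportion p of differing sites.

    Returns math.inf when p >= 0.75, because the argument of the
    logarithm is then zero or negative.
    """
    arg = 1.0 - 4.0 * p / 3.0
    if arg <= 0.0:
        return math.inf
    return -0.75 * math.log(arg)


def fraction_infinite(differs, n_replicates=1000, seed=0):
    """Resample sites with replacement and report the fraction of
    bootstrap replicates whose Jukes-Cantor distance is infinite.

    differs is a list with one entry per site: 1 if the two sequences
    differ at that site, 0 if they agree.
    """
    rng = random.Random(seed)
    n_sites = len(differs)
    n_infinite = 0
    for _ in range(n_replicates):
        # One bootstrap replicate: draw n_sites sites with replacement.
        sample = rng.choices(differs, k=n_sites)
        p = sum(sample) / n_sites
        if math.isinf(jukes_cantor_distance(p)):
            n_infinite += 1
    return n_infinite / n_replicates


# Hypothetical pair of sequences: 100 sites, 70 of which differ.
differs = [1] * 70 + [0] * 30

print(jukes_cantor_distance(0.70))  # finite, roughly 2.03
print(jukes_cantor_distance(0.76))  # inf
print(fraction_infinite(differs))   # share of replicates with p >= 0.75
```

With sequences that start out 70% different, a noticeable share of the
replicates crosses the 75% line, which matches the behaviour described
above. The exact share depends on the number of sites and on the distance
model, so the printed fraction should be read as a rough indication only.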


