
phylip package question

Des Higgins fatherdes at eircom.net
Sat Oct 12 08:22:31 EST 2002


<joe at removethispart.gs.washington.edu> wrote in message
news:anfqe1$9ns$1 at mercury.hgmp.mrc.ac.uk...
> In article <and3qk$pm7$1 at mercury.hgmp.mrc.ac.uk>,
> Chris Hoffman  <choffman at lucas.cis.temple.edu> wrote:
> >I have a question regarding the DNADIST program from Phylip Package.
> >I run SEQBOOT with my seqs and get my new data sets produced using
> >bootstrap, and so far so good. But when I use these new data sets to run
> >DNADIST, the program can't run it because it finds one or more sequences
> >that are supposedly too different to allow the computation to proceed.
> >I tried all the methods available in the program and all give similar
> >results.
> >BTW: I haven't seen any similar messages when running DNAPARS or DNAML.
>
> This occurs for a reason inherent to bootstrapping.  When you have
> sequences, some of which are fairly distant, and bootstrap, two
> sequences can become so far apart that their distance would be infinite.
>
> For example, when you use a Jukes-Cantor distance, any two sequences
> that are more than 75% different will have an infinite distance.  Thus
> when your original sequences are (say) 70% different, bootstrapping
> can occasionally make those sequences 76% different.
>
> What should the distance program do in such a case?  I chose in PHYLIP
> to make it complain and stop.  Other people's programs sometimes are
> set to assign a large number (say 10) as the distance.  Both of these
> policies have disadvantages.  One denies you the ability to use that
> replicate; the other puts in somewhat fictional information.

I am guilty as charged here, seeing as we did that in Clustal (insert an
imaginary large value of 10).  It is ugly and I am not proud of it.  I did
try to insert warnings all over to say that it is dangerous.
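
For readers who want to see why the distance blows up, here is a minimal
Python sketch of the Jukes-Cantor correction and of the two policies
discussed above. It is not the code of DNADIST or Clustal; the function
names (p_distance, jukes_cantor, bootstrap_pair) and the cap argument are
hypothetical and chosen only for illustration.

```python
import math
import random


def p_distance(a, b):
    """Fraction of aligned sites at which two equal-length sequences differ."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty, and equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def jukes_cantor(p, cap=None):
    """Jukes-Cantor distance d = -(3/4) * ln(1 - 4p/3).

    When p >= 0.75 the argument of the logarithm is <= 0, so no finite
    distance exists. Two illustrative policies (assumptions, not taken
    from any real program's source):
      cap=None  -> complain and stop (raise an error)
      cap=10.0  -> substitute an arbitrary large value
    """
    arg = 1.0 - 4.0 * p / 3.0
    if arg <= 0.0:
        if cap is None:
            raise ValueError(f"p = {p:.3f}: sequences too different for a finite distance")
        return cap
    return -0.75 * math.log(arg)


def bootstrap_pair(a, b, rng):
    """Resample aligned columns with replacement, as a bootstrap replicate does."""
    n = len(a)
    cols = [rng.randrange(n) for _ in range(n)]
    return "".join(a[i] for i in cols), "".join(b[i] for i in cols)


if __name__ == "__main__":
    rng = random.Random(1)
    # Two toy sequences that differ at 70 of 100 sites.
    a = "A" * 100
    b = "C" * 70 + "A" * 30
    print("original p:", p_distance(a, b), "d:", jukes_cantor(p_distance(a, b)))
    for _ in range(5):
        ra, rb = bootstrap_pair(a, b, rng)
        p = p_distance(ra, rb)
        # With cap=10.0 a replicate that reaches p >= 0.75 gets the capped value.
        print("replicate p:", p, "d:", jukes_cantor(p, cap=10.0))
```

Calling jukes_cantor(p) with no cap mirrors the "complain and stop" choice,
while cap=10.0 mirrors the "assign a large number" choice. Both have the
drawbacks described in the quoted message.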

Des Higgins

Dept. Biochemistry
University College Cork
Ireland

>
> Parsimony and likelihood don't have this problem, though likelihood
> could put a species on the end of a very long branch.  In Dnaml, I
> just have the branch length get fairly long, then at some point the
> program has iterated it enough and leaves it at that length.
>
> ----
> Joe Felsenstein         joe at removethispart.gs.washington.edu
>  Department of Genome Sciences, University of Washington,
>  Box 357730, Seattle, WA 98195-7730 USA
> ---
>

---




More information about the Mol-evol mailing list
