Before this goes too far, let me note that what I said was that the
paper may be good science, but you can't tell from the paper. I stand
by that statement. If the journal cuts down a paper, it is still the
responsibility of the authors to make sure the fundamental parts of
the scientific method (e.g. reproducibility) are intact.
> >> You're right, the insertion cost equation: a + bx does NOT
> >> make much sense.
> >I beg to disagree. Try reading the literature. Fitch and Smith, PNAS,
> >March 1985, Optimal Sequence Alignments.
All I meant by this was that there is some justification, which one
can understand by reading the paper. I did not mean to say that there
is nothing better; the extensive literature on gap costs by Altschul,
Lipman, and others shows that there are better choices in specific
situations.
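
For anyone following along who has not seen the a + bx form written
out, here is a minimal sketch. It assumes that x is the gap length in
residues, a is a fixed opening cost, and b is a per-residue extension
cost. Some programs use a + b(x-1) instead. The function names and
the values a = 10, b = 1 are made up for illustration and are not
taken from either paper.

```python
def affine_gap_cost(length, a, b):
    """Cost of one gap of `length` residues under the a + b*x model.

    Assumption: x counts every residue in the gap; a gap of length 0
    (no gap at all) costs nothing.
    """
    if length < 0:
        raise ValueError("gap length must be non-negative")
    if length == 0:
        return 0.0
    return a + b * length


def total_gap_cost(aligned_row, a, b, gap_char="-"):
    """Sum the affine cost over each maximal run of gap characters
    in one row of an alignment."""
    total = 0.0
    run = 0
    for ch in aligned_row:
        if ch == gap_char:
            run += 1
        else:
            total += affine_gap_cost(run, a, b)  # close the current run, if any
            run = 0
    total += affine_gap_cost(run, a, b)  # a gap may end the row
    return total
```

With a > 0, one long gap costs less than several short gaps covering
the same number of residues, which is the usual motivation for
charging an opening cost.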
Gaston Gonnet said:
> This is the subject of another paper, we never tried to explain the
> new deletion model in the Science paper. The deletion paper is
> titled "Empirical and Structural Models for Insertions and Deletions
> in the Divergent Evolution of Proteins". I will send a preprint to
> anybody interested (please request by e-mail to oetting.inf.ethz.ch).
> These results were presented at EMBL and more recently at NIH.
I would like to receive this paper. But since the deletion model was
rather important for the evaluation of the science in the paper, it
would have been good to mention it!
[important information allowing one to evaluate the method omitted]
> We do not talk about "PAM matrices", we talk about mutation matrices
> computed at a given PAM distance and Dayhoff matrices, again computed
> at a given PAM distance. Are you sure you understand what is what?
Yup. Are you sure you know how your work will be interpreted? If not,
perhaps a posting here explaining whether (and why or why not) your
mutation matrix (Fig. 2, pg. 1444) should be used in place of the
PAM 250 would be appropriate. A lot of people are *assuming* that is
what you meant and *are* using it that way.
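
A toy sketch of the distinction may help readers here. It assumes the
usual convention: the mutation matrix at PAM distance n is the 1-PAM
mutation matrix raised to the n-th power, and the scoring (Dayhoff)
matrix is 10*log10 of each mutation probability divided by the
background frequency of the resulting residue. The two-letter
alphabet, the frequencies, and the 1-PAM values below are invented
for illustration and have nothing to do with the numbers in the
Science paper.

```python
import math


def mat_mul(a, b):
    """Plain square-matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_pow(m, n):
    """Raise square matrix m to the non-negative integer power n
    by repeated squaring."""
    size = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]
    base = m
    while n > 0:
        if n & 1:
            result = mat_mul(result, base)
        base = mat_mul(base, base)
        n >>= 1
    return result


def log_odds(mutation, freqs):
    """Scoring matrix: 10*log10(M[i][j] / f[i]) for every entry."""
    size = len(mutation)
    return [[10.0 * math.log10(mutation[i][j] / freqs[i])
             for j in range(size)] for i in range(size)]


# Hypothetical two-letter alphabet with equal background frequencies.
TOY_FREQS = [0.5, 0.5]
# Hypothetical 1-PAM mutation matrix: 1% chance of change per residue.
# Column j holds the probabilities of what residue j turns into.
TOY_PAM1 = [[0.99, 0.01],
            [0.01, 0.99]]

toy_mutation_250 = mat_pow(TOY_PAM1, 250)                # probabilities
toy_scores_250 = log_odds(toy_mutation_250, TOY_FREQS)   # alignment scores
```

The first matrix holds probabilities whose columns sum to 1; the
second holds scores that are positive for likely pairings and
negative for unlikely ones. Only the second kind is what an alignment
program expects where it asks for a "PAM 250" table.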
> Our Dayhoff matrices, for any PAM distance, have been derived from
> two orders of magnitude more information than most earlier efforts.
And that is useful.
> We have found that our alignments, in particular for distant relations,
> are better than the ones obtained with other matrices. For closer
> relations, any scoring matrix works well, even an identity matrix works!
We may be wandering into semantics here, or just misunderstanding each
other. I, and a number of people who have sent me e-mail privately,
would want "distant relations" extremely carefully defined before
believing this statement. Again, it may be true, but it is open to a
few different interpretations.
> If you are interested in understanding the theory of proper sequence
> alignment, I can send you a few chapters that explain this in detail.
I am very interested, of course. It would have been very useful to
say in the paper that there are technical reports or other info that
explore this topic further. The entire content in the article is "The
final outcome of the [...] using the DARWIN [...] (19)." That really
doesn't say that the subject is more carefully explained in a
technical report.
There are many people (again judging from private e-mail) who are
very confused by the Science paper. I would like to encourage you to
write a more comprehensive paper and put it in Molecular Biology and
Evolution, J. Mol. Biol., or Bull. Math. Biol., where the community
can really evaluate the work.
dan davison
--
dr. dan davison/dept. of biochemical and biophysical sciences/univ. of
Houston/4800 Calhoun/Houston,TX 77204-5934/davison at uh.edu/DAVISON at UHOU
-----RIP Isaac Asimov 1920-1992 I'll miss him --------------------
Disclaimer: As always, I speak only for myself, and, usually, only to
myself.