>Subject: Treatment of deletions in IT measures on biosequences?
>Message-ID: <1992Mar20.233040.4966 at ctr.columbia.edu>
>Originator: shenkin at avogadro.barnard.columbia.edu
>I want opinions on the following question:
> Should deletions in (some sequences in a set of) aligned
> sequences be considered "types" for the purpose of calculating
> information-theoretical uncertainties?
.........stuff deleted.....
>Thus, treating deletions as types appears to be useful for picking out
>inter-group heterogeneity; but if I use it there, it seems to me that
>I ought also to be using it within groups, for consistency.
>Thoughts, anybody?
>************************f*u*cn*rd*ths*u*cn*gt*a*gd*jb*************************
>Peter S. Shenkin, Department of Chemistry, Barnard College, New York, NY 10027
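To make the quoted question concrete, here is a minimal sketch of the two conventions for a single alignment column. It assumes "uncertainty" means the Shannon entropy in bits and that a deletion is written as "-"; Peter's actual program is not shown in the post, so the function name column_entropy and the gap_as_type switch are hypothetical.

```python
from collections import Counter
from math import log2


def column_entropy(column: str, gap_as_type: bool = True, gap: str = "-") -> float:
    """Shannon entropy (bits) of one alignment column.

    Assumption: a deletion is written as `gap`. With gap_as_type=True the gap
    counts as one more symbol type; with False, gapped rows are dropped
    before the frequencies are computed.
    """
    symbols = column if gap_as_type else [c for c in column if c != gap]
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        # A column made only of ignored gaps carries no observations.
        return 0.0
    # H = sum over types of p * log2(1/p)
    return sum((n / total) * log2(total / n) for n in counts.values())


if __name__ == "__main__":
    col = "AA--"  # two sequences carry A, two carry a deletion
    print(column_entropy(col, gap_as_type=True))
    print(column_entropy(col, gap_as_type=False))
```

With the gap treated as a type, a column that is half residue and half deletion looks maximally mixed between two types; with gaps ignored, the same column looks perfectly conserved. That is the inconsistency the question is about.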
Since Peter mentions his computer program, let me point out something of
possible interest to programming-ophiles:
Finding String Distance: A rose by any other name has a
particular Levenshtein distance.
Ray Valdes
Dr. Dobb's Journal 17(4):56-pp Apr 1992
Lead paragraph:
"This article discusses the theory and practice of sequence comparison. This
topic has lurked unnoticed on the sidelines of computer science, but has
proved tremendously important in biotech research and may now have
widespread application in the areas of handwriting and speech recognition."
Dr. Dobb's, if you are not already a subscriber, is available at better
newsstands.
FASTA is mentioned at the end.
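For anyone who has not met the Levenshtein distance named in the title, here is a minimal sketch of the textbook dynamic-programming calculation with unit costs. It is not the code from the article, which I am not reproducing here; the function name levenshtein is my own.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and
    substitutions needed to turn string a into string b (unit costs)."""
    # previous[j] holds the distance between the first i-1 characters of a
    # and the first j characters of b; row 0 is "insert j characters".
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(levenshtein("kitten", "sitting"))
```

Only two rows of the table are kept at a time, so memory grows with the length of the second string rather than with the product of the two lengths. Note that an insertion or deletion here costs the same as a substitution, which is one simple way of scoring gaps.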
Another article in the same issue is:
Your Own Handprinting Recognition Engine: Algorithms for putting
pen to pad.
Ron Avitzur
Dr. Dobb's Journal 17(4):32-pp Apr 1992
Lead paragraph:
"This article discusses the design and implementation of a writer-dependent
recognizer for hand-printed text. This recognition engine forms the basis
of a pen-based interface to a symbolic math program. My recognizer is
distinguished by its small size and straightforward implementation, which
is nevertheless able to achieve high character accuracy......."
"Ron completed undergraduate physics at Stanford in 1990. While there, he
developed Milo and the math engine used by FrameMaker, the only page-layout
program which can symbolically evaluate derivatives...."
I noticed that Ron doesn't rely on neural nets, but rather on hashing functions.
Dr. Dobb's also announces its first-ever DDJ Recognition Contest:
programmers, start your engines (algorithms will be evaluated against the
extensive database of handwriting samples at GO Corp.).
It may be worth keeping an eye on the results! There may be approaches
of interest to sequence searching and motif recognition.
--Steve
---
+------------------------------------------------------------------+
| In person: Steve Modena AB4EL |
| On phone: (919) 515-5328 |
| At e-mail: nmodena at unity.ncsu.edu |
|samodena at csemail.cropsci.ncsu.edu |
| [ either email address is read each day ] |
| By snail: Crop Sci Dept, Box 7620, NCSU, Raleigh, NC 27695 |
+------------------------------------------------------------------+
Lighten UP! It's just a computer doing that to you.
OOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOOO