I figured out a way of building true inverted indexes for DNA sequences
that require only 4x the size of the original sequence. Of course, as
with any deterministic index, exact searches are extremely fast. But
I've also worked out pretty fast algorithms for "near" matches, and I'm
working on "gapped" matches.
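
To illustrate the general idea only: the sketch below is a textbook
k-mer inverted index in Python, not the indexing scheme described above
(its details are not given in this post), and it makes no attempt at the
4x size bound. The names build_index, find_exact and find_near, and the
parameter k, are all hypothetical. "Near" here is assumed to mean a
match with at most a fixed number of mismatched bases and no gaps; the
near search relies on the pigeonhole argument that if a pattern is cut
into (max_mismatches + 1) disjoint blocks, at least one block must match
exactly.

```python
from collections import defaultdict


def build_index(seq, k):
    """Map every k-mer in seq to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def find_exact(seq, index, k, pattern):
    """Return start positions where pattern occurs exactly in seq."""
    if len(pattern) < k:
        raise ValueError("pattern must be at least k bases long")
    hits = []
    # Look up the first k-mer, then verify the rest of the pattern.
    for pos in index.get(pattern[:k], []):
        if seq.startswith(pattern, pos):
            hits.append(pos)
    return hits


def find_near(seq, index, k, pattern, max_mismatches):
    """Return start positions where pattern matches with at most
    max_mismatches substitutions (no insertions or deletions)."""
    m = len(pattern)
    blocks = max_mismatches + 1
    if m < blocks * k:
        raise ValueError("pattern too short for this k and mismatch limit")
    hits = set()
    # Pigeonhole: at least one of the disjoint blocks is mismatch-free,
    # so every valid match is seeded by an exact lookup of some block.
    for b in range(blocks):
        offset = b * k
        for pos in index.get(pattern[offset:offset + k], []):
            start = pos - offset
            if start < 0 or start + m > len(seq):
                continue
            window = seq[start:start + m]
            mismatches = sum(1 for a, c in zip(window, pattern) if a != c)
            if mismatches <= max_mismatches:
                hits.add(start)
    return sorted(hits)


seq = "ACGTACGTTTGACCA"
k = 3
index = build_index(seq, k)
exact_hits = find_exact(seq, index, k, "ACGT")
near_hits = find_near(seq, index, k, "ACGTTC", 1)
```

A plain dictionary like this uses far more memory than the sequence
itself, so it is useful only as a picture of how an inverted index
answers exact and near queries, not as a model of the space usage.
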
However, my understanding of the biology behind all this is a little
weak. (I'm a mere programmer.)
I'm looking for someone who needs more precise searching than FASTA or
BLAST can provide and is interested in a single chromosome or two. (My
indexing system can only handle sequences up to 1 billion bases, so I've
been indexing chromosomes separately.)
If you know of someone who would be willing to work with me and explain
the biology, please have them contact me.