Just something I've been curious about:
I use NCBI's BLAST network server with the "nr" non-redundant
sequence database. It's a wonderful service, but in the output
I sometimes see what appear to be redundant entries. What scheme
is used to cluster duplicates between the component databases, and
how sensitive is it?
On a different note, it seems that the EST database can only
be searched at the DNA level. When will searching against possible
EST translation products be available? Also, does the EST database
cover only human ESTs, or does it include C. elegans ESTs as well?
Keith Robison
Harvard University
Program in Biochemistry, Molecular, Cellular, and Developmental Biology
robison at ribo.harvard.edu