Last year my student Nate did an analysis for me, and left me some Delila
instructions. They included a reference to the locus ECONDH. I pulled out all
other references from the current database, but this one was missing. I went
to IRX and located the new version as ECONDHX. UGH. You see? By changing the
name of this locus, you made my Delila instructions out of date. Locus names
are just about useless! Why not get rid of them?
Now the question is, why did this happen? It turns out that Nate was working
from this paper:
@article{Spiro1989,
author = "S. Spiro
and R. E. Roberts
and J. R. Guest",
title = "{FNR}-dependent repression of the {\em ndh} gene of
{{\em Escherichia coli}} and metal ion requirement for {FNR}-regulated
gene expression",
journal = "Molec. Microb.",
volume = "3",
pages = "601-608",
year = "1989"}
Apparently the sequence from this paper was entered into the database sometime
after it was published. Then someone realized that Spiro1989 referred to an
earlier paper that has the entire gene. So GenBank staff RIPPED OUT THE SPIRO
REFERENCE and replaced it with the earlier one.
"So what?" you say. "Wasn't that the right thing to do?" No. The Spiro paper,
the newer of the two, gives the location of an FNR binding site. I don't know
if this was in the original entry, but it is not in the entry now, and there is
no notation of it.
In other words, the location of the KNOWN site was lost from the database.
This is exactly why I wrote that "Philosophy of GenBank". We are letting
excellent data slip into the void.
The proper thing to do is to include the Spiro reference in the database and
to mark the FNR site. This correction is, however, a mere drop in the bucket.
I am calling for a major commitment to correct this disgusting situation.
Tom "Cassandra" Schneider
National Cancer Institute
Laboratory of Mathematical Biology
Frederick, Maryland 21702-1201
toms at ncifcrf.gov