Medline etc ...

Paul Gilna pgil at TEMIN.LANL.GOV
Tue Jun 9 17:31:04 EST 1992

> (Asbestos suit on)
> (Asbestos suit off)

click! click! hssssss... Whumpfh!

Couldn't resist it, sorry!

I'm probably crossing all sorts of boundaries here, but I think this
needs to be said. Regardless of Medline/Genbank/Entrez as the data
source, the sentiment that I am picking up is that centralised access
to the data is essential.

My belief is that the experience we have here dictates that we must

If you look at the services provided by GenBank today, you will see
that both capabilities are provided: tape/CD/network distribution (with
or without interim updates) and on-line access by direct dial,
internet and email server.

While neither system can be claimed to be perfect, it's clear that there
is a strong preference for the login/server path over off-line
access, and that we are doing a good job in meeting the demands on this

Yet look at the evidence we are facing today: the available servers are
processing literally thousands of queries per day, and this rate is
climbing with no sign of abating. GenBank is already placing load
limiters on the retrieval queues, and I would guess that one if not
more of the CPUs dedicated to this service is in permanent FASTA
mode.  Together, these factors make a system failure (or even planned
downtime) far more serious for a community becoming increasingly
dependent on a centralised data distribution mechanism--CD-ROMs may be
slow, but you're going to get, er, annoyed when you cannot get your
FASTA results back because you are behind 500 other jobs and you are
three time zones away.

And I'll bet that's only a glimmer of what would happen if the entire Medline
user community could suddenly dial or internet in to NLM!

There's no blame to be laid here--everyone involved is doing a good job
with the resources to hand--I believe we are seeing a natural evolution
that will ultimately lead to the breakdown of centralised data
distribution systems--user demand will outpace the system's capability
to cope.

Beyond providing a form of distributed data access, I don't know whether
today's CD-ROMs per se provide the solution, but CDs are easy to
produce (easier than tape, anyway), and new technology, be it optical or
magnetic (eg helical-drive based systems), can only serve to drive the
cost per MB down and the speed of access up. The price of a drive will
seem a small price to pay when guaranteed access to the data is at

(Remember also that decentralisation does not necessarily have to
devolve entirely to the desktop workstation/removable media but could be a
combination of local (eg desktop), institutional (eg university) and
satellite-node (eg state-wide network server) networked access.)

I believe that (at least in GenBank's experience), like it or not, only
by decentralising the storage and CPU demands of data access can we
hope to meet the demands of our user base, and that this is where we
must devote our energies--the trick will be to provide the same
functionality, particularly rapid updating and speed of access, that
today's centralised systems (so far) enjoy.

