Summary: Easier data installation?

Reinhard Doelz doelz at comp.bioz.unibas.ch
Sat Jun 26 11:48:10 EST 1993

In article <1993Jun26.155829.16467 at leland.Stanford.EDU>, cherry at genome.Stanford.EDU (Mike Cherry) writes:
|> WCS used rdist to update their files, and I believe WCS hasn't had an
|> update since Feb 1992. Which does mean the local WCS admin doesn't do
|> quoting Don Gilbert saying, 
|> >It would make the life of overworked sys admins and undertrained biologists
|> >simpler if acedb software could incorporate network links to
|> >a home server which would update local data automatically, more or less.
|> >I'd hope to see this scheme considered for many biology databases, esp.
|> >if they include tailored software like acedb does.
|> We, the AAtDB Project, haven't heard many (if any) complaints about the
|> update process. We have biologists without any experience with
|> computers doing the AAtDB updates around the world. Do others have
|> suggestions for making the process easier than it already is? Is this
|> a problem that others are concerned about?

I am working with a general-purpose client/server architecture which allows
for easy data update/transfer. The software runs under the Hierarchical Access 
System for Sequence Libraries in Europe (HASSLE protocol) and updates 
data with a merging process. Tailored procedures for EMBL and other sequence 
databases make life quite easy. Test-bed installations have been running 
successfully for over a year, and wider application is envisaged this fall. 
Another superior approach for sequence databases is Peter Gad's (Uppsala)
EMBL Network Data Transfer protocol (EMBL NDT), which is used in EMBLnet 
international links. 

The general question, however, is what the users want. Bluntly, biologists
are not informed about what they miss if they don't do the updates right. 

Two alternatives: 

PUSH paradigm: 
A big 'papa' sits somewhere on the network and pushes data to all nodes that
have subscribed to the service. The customer runs a server process (most
trivially, an FTP server) and the provider runs a client process which connects
to this server in order to push all data needed. EMBL NDT works this way, too.
The drawback is that the data service provider might need to access the machine
to be updated at inconvenient times, and/or might require privileges on it. 
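
A minimal sketch of the push idea in Python, for illustration only. All
names here (Subscriber, accept, push_release) are hypothetical and this is
not the EMBL NDT protocol; a method call stands in for the network
connection. It shows the provider connecting to each subscribed node and
what happens when a node cannot be reached at the moment the provider
chooses.

```python
from typing import Dict, List


class Subscriber:
    """Customer node: runs a server process that accepts pushed data.

    Hypothetical stand-in for, e.g., an FTP server at the customer's site.
    """

    def __init__(self, name: str, online: bool = True) -> None:
        self.name = name
        self.online = online
        self.store: Dict[str, str] = {}

    def accept(self, entries: Dict[str, str]) -> None:
        # The provider decides when this is called, not the customer.
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        self.store.update(entries)


def push_release(subscribers: List[Subscriber],
                 entries: Dict[str, str]) -> List[str]:
    """Provider side: push the entries to every subscriber.

    Returns the names of the nodes that could not be updated, so the
    provider knows which ones to retry later.
    """
    failed = []
    for sub in subscribers:
        try:
            sub.accept(entries)
        except ConnectionError:
            failed.append(sub.name)
    return failed
```

The provider has to keep the subscriber list and handle the retries, which
is the bookkeeping burden the push approach places on 'papa'.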

POLL paradigm: 
A customer runs a client process and polls for new data at the provider's 
service process. This resembles the current NCBI setup for GenBank data, so 
the easiest implementation would be an FTP client at the customer's site and 
an FTP server at the provider's. The customer should, however, receive exactly
the data that are missing at the customer's site (and only these). Therefore, 
some computation at the server's site is desirable. HASSLE's dbtools software
works this way. 
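
The "only what is missing" step can be sketched as follows in Python. This
is an assumption-laden illustration, not the HASSLE dbtools implementation:
the entry layout (id mapped to a version number and data), the manifest,
and the function names compute_delta and poll_once are all hypothetical,
and a function call stands in for the network round trip.

```python
from typing import Dict, List, Tuple

# Hypothetical record layout: entry id -> (version, data).
Entry = Tuple[int, str]


def compute_delta(server_db: Dict[str, Entry],
                  client_manifest: Dict[str, int]) -> Dict[str, Entry]:
    """Runs at the provider.

    Returns only the entries the customer lacks or holds in an older
    version, judged from the manifest the customer sent.
    """
    return {
        entry_id: (version, data)
        for entry_id, (version, data) in server_db.items()
        if client_manifest.get(entry_id, -1) < version
    }


def poll_once(server_db: Dict[str, Entry],
              client_db: Dict[str, Entry]) -> List[str]:
    """Runs at the customer.

    Sends a manifest of (id, version) pairs, merges whatever comes back
    into the local database, and returns the ids that were updated.
    """
    manifest = {eid: version for eid, (version, _) in client_db.items()}
    delta = compute_delta(server_db, manifest)  # stands in for the network call
    client_db.update(delta)
    return sorted(delta)
```

The customer decides when to poll, and a second poll right after the first
transfers nothing. Deleted entries are not handled in this sketch.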

I would very much enjoy more discussion on this topic. Can someone comment 
on the availability of WCS in Europe, and its implementation (hardware, 
software, reliability)? 

|    Dr. Reinhard Doelz            | RFC     doelz at urz.unibas.ch         |
|      Biocomputing                | DECNET  20579::48130::doelz         |
|Biozentrum der Universitaet       | X25     022846211142036::doelz      |
|   Klingelbergstrasse 70          | FAX     x41 61 261- 6760 or 267- 2078     
|     CH 4056 Basel                | TEL     x41 61 267- 2076 or 2247    |   
+------------- bioftp.unibas.ch is the SWISS EMBnet node ----------------+
