Locally Mounted Databases

Ken Baker ken.baker at bbsrc.ac.uk
Tue Nov 5 08:33:15 EST 1996

In article <devjani-0211960238260001 at devjani1.jci.tju.edu>,
devjani at calvin.jci.tju.edu (Devjani Chatterjee) wrote:

> In article <552p3u$j7c at netnews.upenn.edu>, ellis at vesicle.dental.upenn.edu
> (Ellis Golub) wrote:
> > Until recently, we mounted the full GCG database suite locally.  Then we ran
> > out of disk space.  We are now considering whether to permanently stop
> > mounting the databases locally, or to increase the disk capacity of our
> > system to once again contain the databases.
> > 
> > I am interested in the experiences/solutions taken by other sites.  In
> > considering this issue, some users report that they can search the public
> > databases more quickly than they had been able to search the local databases
> > previously, and they worry that as the size increases, searching speed will
> > decrease further. 
> > 
> > I would very much appreciate hearing from users and system administrators on
> > this subject.   
> > 
> > --
> > Ellis Golub                  Phone: (215) 898-4629
> > Biochemistry Department      FAX:   (215) 898-3695
> > University of Pennsylvania   ellis at biochem.dental.upenn.edu
> > School of Dental Medicine             
> > 4001 Spruce Street
> > Philadelphia, PA 19104-6003
>          This is from downtown Philadelphia ! 
>          We maintain all the databases locally. Genbank is updated weekly.
> We work with an SGI and till now have had the databases (GenBank, GenPept,
> PIR, Swiss) on one disc and PDB on another. I will shortly have to move the
> protein databases to accommodate GenBank.
>          I am curious how the GCG programs will work without local
> databases. For example, 'stringsearch' or even 'fetch'? Ultimately, of
> course, all these programs will have to work over the network.
>          Regards!

We have got stringsearch and fetch working over our network.

What we've done is this. In the BBSRC (Biotechnology and Biological
Sciences Research Council) here in the UK, we have a number of Institutes,
and even more laboratories, spread countrywide. The BBSRC used to
maintain multiple copies of EMBL, GenBank etc., one at each of the
Institutes which ran GCG. A couple of years ago, it became obvious
that disk space was going to be a problem, so we, at the Computing
Centre, decided to write a client/server system.

So, now we've got MoBiCS clients running on the VAXen or Alphas at the
Institutes. The idea is that the user fires up fasta, stringsearch, fetch,
whatever, and actually gets the MoBiCS version. This looks exactly like
the GCG version (no training issues), but just sends the job to a fast
Alpha, with lots of disk space, sitting here at the Computing Centre and
running the MoBiCS server. That runs the fasta, stringsearch or whatever
job, and returns the result file to the user's disk space, transparently.
Because we are using a fast Alpha, we saw a fairly substantial decrease
in the turnaround time for a fasta job when we installed it all.
Currently we only run it over our own WAN, agrenet, but we are looking at
running it over the Internet in the future.
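
For anyone wanting to picture the arrangement, here is a minimal sketch
of the same client/server idea in Python. It is not MoBiCS and says
nothing about how MoBiCS talks over the wire; the JSON-lines protocol, the
names run_remote, start_server and JobHandler, and the fake_stringsearch
stand-in are all assumptions made up for illustration. The only point is
the shape: the client hands the job to a central server, and the result
comes back as a file in the user's own disk space.

```python
import json
import socket
import socketserver
import threading
from pathlib import Path


def fake_stringsearch(args):
    # Hypothetical stand-in for a real search program on the central server.
    return "stringsearch results for: " + " ".join(args) + "\n"


# Programs the (hypothetical) server is willing to run, keyed by name.
HANDLERS = {"stringsearch": fake_stringsearch}


class JobHandler(socketserver.StreamRequestHandler):
    """Reads one JSON job per connection, runs it, sends one JSON reply."""

    def handle(self):
        job = json.loads(self.rfile.readline().decode("utf-8"))
        program = HANDLERS.get(job["program"])
        if program is None:
            reply = {"ok": False, "output": "unknown program: " + job["program"]}
        else:
            reply = {"ok": True, "output": program(job["args"])}
        self.wfile.write((json.dumps(reply) + "\n").encode("utf-8"))


def start_server(host="127.0.0.1", port=0):
    # Port 0 asks the operating system for any free port.
    server = socketserver.ThreadingTCPServer((host, port), JobHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def run_remote(address, program, args, result_path):
    """Client side: send the job, then write the result to a local file."""
    request = json.dumps({"program": program, "args": args}) + "\n"
    with socket.create_connection(address) as sock:
        sock.sendall(request.encode("utf-8"))
        with sock.makefile("rb") as stream:
            reply = json.loads(stream.readline().decode("utf-8"))
    if not reply["ok"]:
        raise RuntimeError(reply["output"])
    Path(result_path).write_text(reply["output"])
    return result_path
```

A wrapper installed under the usual program name would call something
like run_remote(server.server_address, "stringsearch", ["kinase"],
"kinase.out"), so the user sees the familiar command and a result file
and never touches the databases directly.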

We carry GenEMBL (nonredundant GenBank + EMBL) in GCG format, BLAST
indices for GenBank and EMBL, PIR, SWISS-PROT, TrEMBL plus a few smaller
ones. We are projecting, in 18 months' time, 70 GB, including processing
space. Makes you think, doesn't it?



Dr Ken Baker                          Internet: ken.baker at bbsrc.ac.uk
B.I.T.S.                       http://www.cc.bbsrc.ac.uk/biolist.html
BBSRC Computing Centre            BritishTelecomnet: (+44)1582 762271
West Common                                  Faxnet: (+44)1582 761710
Harpenden AL5 2JE                           ICBMnet:  51'48'N 00'21'W
