Local Databases [confusion/?]

Reinhard Doelz doelz at comp.bioz.unibas.ch
Fri Feb 17 02:25:47 EST 1995

jsmith at MERLIN.MED.ECU.EDU wrote:

: >databases....!!!  At $US2000 per annum to maintain the database, it
: >is WAY outside of our budget, so we have simply done without
: Now I am confused??  I was under the impression that one could purchase
: the GenBank database from NCBI for a very reasonable fee  (<$250 US/yr)
: Then the idea was you could convert this to GCG format without too much
: trouble.  Certainly the cost of maintaining an onsite database at these 
: prices would out weigh the time required to convert to GCG format.

To avoid misunderstandings: both EMBL/SWISSPROT and GENBANK come in 'generic'
format on CD at very reasonable prices (PIR International, in contrast, 
is four times as expensive). You can also get them over electronic networks 
using the 'ftp' program. 

FULL DATABASES: released every two or three months on CD. Shipping takes up 
to a month depending on database and destination (typically less than 2 weeks). 
There are three issues: (1) To transfer the databases, you need a fairly fast 
network connection (64k is NOT sufficient), as the compressed DNA databases 
occupy about 250MB each. (2) You need to uncompress the data into typically 
three times the disk space. (3) GCG reformatting yields about the same volume 
again in GCG format. The worst-case scenario, therefore, is that you need 
about 3.3 times the disk space of the database itself: production version, 
compressed, uncompressed, and GCG-formatted new version. The best case is 
approx. 1.5 times the disk space, but it is rudimentary: no backups and no 
continuous work, so GCG is out of service during the process and a bit of 
script work is needed. You delete the entire previous production version, 
ftp the new compressed files, and do the uncompression/formatting for each 
section individually.
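
As a rough check on the numbers above, here is a minimal planning sketch. It 
assumes that '64k' means a 64 kbit/s line with no protocol overhead and that 
the 3.3 figure is counted in multiples of the uncompressed database; the 
function names are only illustrative.

```python
# Rough planning numbers for one database release, using the figures above.
COMPRESSED_MB = 250      # compressed DNA database, per the text
EXPANSION = 3            # uncompressed is about 3x the compressed size, per the text
LINK_KBIT_PER_S = 64     # assumption: '64k' means a 64 kbit/s line


def transfer_hours(size_mb: float, kbit_per_s: float) -> float:
    """Hours to move size_mb megabytes over an ideal link (no overhead)."""
    bits = size_mb * 1_000_000 * 8
    return bits / (kbit_per_s * 1000) / 3600


def worst_case_factor(expansion: float) -> float:
    """Disk needed, in multiples of the uncompressed database.

    Old production copy (1) + compressed download (1/expansion)
    + uncompressed flat files (1) + new GCG-formatted copy (1).
    """
    return 1 + 1 / expansion + 1 + 1


if __name__ == "__main__":
    uncompressed_mb = COMPRESSED_MB * EXPANSION
    factor = worst_case_factor(EXPANSION)
    hours = transfer_hours(COMPRESSED_MB, LINK_KBIT_PER_S)
    print(f"transfer at {LINK_KBIT_PER_S} kbit/s: {hours:.1f} h")
    print(f"worst-case disk: {factor:.1f}x = {factor * uncompressed_mb:.0f} MB")
```

Under these assumptions a single compressed database needs most of a working 
day on the line, which is why 64k is not practical for regular full transfers.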

UPDATES: most databases (except PIR) have updates available for transfer via 
the net, or other parties (like EMBnet) take care of this. 
There are two options: EITHER you transfer the full, 'cumulative' set since 
the last release and treat it like the full database as described above, 
OR you take the 'incremental' transfers and try to keep your own updated set. 
The latter in particular causes significant resource consumption if you really
want to be in sync with the major databases. 
Note that UPDATES do not come in CD/tape format unless purchased under specific 
arrangements, from industry or elsewhere. That is no longer 'cheap', though. 
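
The difference between the two update styles can be sketched as follows. 
This is only an illustration: entries are modelled as a dictionary keyed by 
a made-up entry identifier, and deletions or renamed entries, which a real 
update procedure has to handle, are not covered.

```python
def apply_update(local: dict[str, str], batch: dict[str, str]) -> dict[str, str]:
    """Return a new set in which entries from batch replace or extend local."""
    merged = dict(local)
    merged.update(batch)
    return merged


def apply_incrementals(release: dict[str, str],
                       batches: list[dict[str, str]]) -> dict[str, str]:
    """Apply every incremental batch, in order, on top of the last release."""
    local = dict(release)
    for batch in batches:
        local = apply_update(local, batch)
    return local


if __name__ == "__main__":
    # Hypothetical identifiers and contents, for illustration only.
    release = {"X00001": "v1", "X00002": "v1"}
    week1 = {"X00002": "v2"}
    week2 = {"X00002": "v3", "X00003": "v1"}
    cumulative = {"X00002": "v3", "X00003": "v1"}

    via_incremental = apply_incrementals(release, [week1, week2])
    via_cumulative = apply_update(release, cumulative)
    print(via_incremental == via_cumulative)
```

The incremental route only matches the cumulative one if no batch is missed 
and the batches are applied in order, which is where the local effort goes.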

To make one point clear: any updating you do yourself requires local resources (staff, 
disk, maintenance). If you cannot get service from your nearest national 
EMBnet node, or if you have to buy the service, then GCG's offer is worth 
considering, as are potentially other updates or other packages. However, 
if you want weekly or even daily updates, you MUST have your own setup. 

Based on experience, the average institute size at which you can afford to run
your own service is 300-500 scientists (typically, half of them using GCG). 
This is not to say that you cannot do it with fewer, but I am afraid the 
decision is then based on scientific and political rather than economic 
considerations.

We have run (and still do for some purposes) network searches against the 
NCBI. The biggest problem is to find a homogeneous environment which is suited
to get the data you found onto your disk in a GCG-usable form. As mentioned
earlier, quick-search and single-entry retrieval are fine (you might want 
to try out the SRS browser system on 12 servers with 60 databases, too - 
try http://www.embl-heidelberg/srs/status.html or, alternatively, 
http://www.ch.embnet.org/srs/status.html). Reasonable searches and 
biological evaluation of the results, however, seemed to be impossible 
without up-to-date databases locally. EMBnet takes care of the 'small sites' 
issue in Europe, and has proven to be beneficial economically and 
scientifically. If you reside in Europe and do not yet know which node is 
nearest to you, try the most recent newsletter at 
http://www.ch.embnet.org/embnet.news/ which also has a section listing the 
EMBnet nodes, their services and contact addresses.

Reinhard Doelz
EMBnet Switzerland

 R.Doelz         Klingelbergstr.70| Tel. x41 61 267 2247  Fax x41 61 267 2078|
 Biocomputing        CH 4056 Basel| electronic Mail    doelz at ubaclu.unibas.ch|
 Biozentrum der Universitaet Basel|-------------- Switzerland ---------------|
 EMBnet Switzerland: info at ch.embnet.org   http://beta.embnet.unibas.ch/

