Quo vadis Gopher ? (Re: List of BioGophers)

Reinhard Doelz doelz at comp.bioz.unibas.ch
Tue Oct 27 16:58:10 EST 1992


In article <1992Oct27.072456.13689 at nic.funet.fi>, harper at convex.csc.FI (Rob Harper) writes:
|>   
|>   So I would like to ask how people feel about mirroring of resources.
|>   Is it enough that everyone in Europe pounds Don's GenBank resource.
|>   Is it enough that everyone in USA pounds Reinhard EMBL resource...
|>   and everyone in the world jumps on Dan's PIR resource, and when
|>   these centres run into service glitches then nobody has anywhere to
|>   go. Do we need duplication??? What should we duplicate??? The BIG
|>   databases or the smaller ones like ace and aat??? Who is going to
|>   have the disk space to provide the service???
|> 

Good point. Let me tell you that the GOPHER indices grow as fast as the 
databases do. Because GOPHER is free, people think that it is cheap. 
Well, nearly... the indices for the EMBL databases alone come to about 50 MByte: 
4490    /bioy/gopher-data/index/embl/fun
9925    /bioy/gopher-data/index/embl/inv
4038    /bioy/gopher-data/index/embl/mam
3384    /bioy/gopher-data/index/embl/org
1134    /bioy/gopher-data/index/embl/phg
5242    /bioy/gopher-data/index/embl/pln
22777   /bioy/gopher-data/index/embl/pri
12398   /bioy/gopher-data/index/embl/pro
17139   /bioy/gopher-data/index/embl/rod
1827    /bioy/gopher-data/index/embl/syn
3232    /bioy/gopher-data/index/embl/una
9898    /bioy/gopher-data/index/embl/vrl
4705    /bioy/gopher-data/index/embl/vrt
100191  /bioy/gopher-data/index/embl

Plus the daily updates:
18858   /bioy/gopher-data/index/xembl
14041   /bioy/gopher-data/index/xxembl

As mentioned earlier, hosts or resources which are 'down' are frustrating.
Therefore, during the daily update procedure, I keep the databases in
DUPLICATE and rename the paths only after a crosscheck. That currently gets me
a 100 MB allocation for EMBL database indexing alone. If you are going to tell
me that this is little, I agree, but remind you that the BLAST and GCG
formatted databases also take their share. Not to mention the resources needed
to keep the CD-ROMs mounted (currently 3 drives at the bioftp server). 
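
A minimal sketch of the "build a second copy, crosscheck, then rename the
paths" update described above. It is not the actual update script used on
bioftp; the function name publish_index and the build/verify callbacks are
hypothetical, and the only assumption is that staging and live directories
sit on the same filesystem so that the renames are cheap.

```python
import os
import shutil
import tempfile


def publish_index(live_dir, build, verify):
    """Build a fresh index beside live_dir, crosscheck it, then swap paths.

    build(path) and verify(path) are hypothetical callbacks supplied by the
    caller; verify must return True before the live path is touched.
    Returns the path the previous copy (if any) was moved to.
    """
    live_dir = os.path.abspath(live_dir)
    parent = os.path.dirname(live_dir)
    # The second copy is the reason the disk allocation doubles.
    staging = tempfile.mkdtemp(prefix="index-new-", dir=parent)
    try:
        build(staging)
        if not verify(staging):
            raise RuntimeError("crosscheck failed; live index left untouched")
    except Exception:
        shutil.rmtree(staging, ignore_errors=True)
        raise
    retired = live_dir + ".old"
    shutil.rmtree(retired, ignore_errors=True)  # drop the copy from last time
    if os.path.exists(live_dir):
        os.rename(live_dir, retired)
    os.rename(staging, live_dir)
    return retired
```

Between the two renames there is a brief moment in which the live path does
not exist, so a production version would probably switch a symbolic link
instead; the sketch keeps to plain renames because that is what the text
describes.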

In total, including scratch and redundant archive space, I currently allocate
5-6 Gigabytes of disk on several computer systems to provide GCG, 
BLAST, and GOPHER. We anticipate a doubling time of 16 months for EMBL. 
It's time to think about who duplicates what. Certainly, the small sites 
will lose out on purely material grounds. I worry at the moment whether I 
will become a small site sooner or later, because our funding agencies might
not like to throw money at a center which serves the world rather than 
exclusively our own people. 
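
The arithmetic behind these figures can be sketched as follows. The posting
does not say which unit the du listing uses; the sketch ASSUMES 512-byte
blocks, because that is the unit under which the listed total matches the
"about 50 MByte" quoted above (with 1 KB blocks it would be about 100 MB).
The growth projection simply applies the anticipated 16-month doubling to the
index size, on the stated premise that the indices grow as fast as the
databases do.

```python
# du figures for /bioy/gopher-data/index/embl/*, copied from the listing above
EMBL_INDEX_BLOCKS = {
    "fun": 4490, "inv": 9925, "mam": 4038, "org": 3384, "phg": 1134,
    "pln": 5242, "pri": 22777, "pro": 12398, "rod": 17139, "syn": 1827,
    "una": 3232, "vrl": 9898, "vrt": 4705,
}
BLOCK_BYTES = 512      # ASSUMPTION: du counts 512-byte blocks (not stated)
DOUBLING_MONTHS = 16   # doubling time anticipated for EMBL in the text


def index_megabytes(blocks=EMBL_INDEX_BLOCKS, block_bytes=BLOCK_BYTES):
    """Total size of the listed index directories in MB (10**6 bytes)."""
    return sum(blocks.values()) * block_bytes / 1e6


def projected_megabytes(now_mb, months, doubling_months=DOUBLING_MONTHS):
    """Size after `months` of exponential growth with the given doubling time."""
    return now_mb * 2 ** (months / doubling_months)


if __name__ == "__main__":
    now = index_megabytes()
    print(f"one copy: {now:.1f} MB, two copies during update: {2 * now:.1f} MB")
    for months in (16, 32, 48):
        print(f"after {months} months: {projected_megabytes(now, months):.0f} MB")
```

Under the 512-byte assumption the thirteen subdirectories sum to 100189
blocks, or roughly 51 MB per copy, which is consistent with the 100 MB
allocation needed when two copies are kept during the update.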


|> 2)Lost in gopher-space... How many people have had this experiance.
|>   You read about a resource. You track it down. You negotiate all
|>   the menus, and finally you reach your destination. You fail to make
|>   a bookmark, and the next time you try to navigate to the same place
|>   you can never find it. What can we do to make moving about in
|>   gopher-space a "memorable" experiance. Do we need a standard bio-gopher
|>   interface (/databases, /software, /hints) that looks the same on
|>   every machine, or do we need weird flashing neon lights and steam
|>   whistles (Desperately Seeking Suzan) to provide hooks that jolt
|>   our memories into remembering "hey I've been in this place before".
|>   I have gone in the latter direction, renaming "Name=" to something
|>   more graphic and descriptive that I have found in the original link.
|> 

Bookmarks are fine toys for keeping this organized. A 'who has what'
database is certainly needed. It should then be searchable via a fanout
mindexer like the one I already use for the 
EMBL subsections. I think we should have an index-type gopher 
sitting at some (agreed) free port and addressable by such a 'who has what' 
gopher.  The problem is updating, though. It would be cute to have
an archie-type polling of a .resources file, so that the database could then
be maintained automatically. Even automatic sorting would be possible.
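
As an illustration of how such polling could feed a 'who has what' index,
here is a minimal sketch. No .resources format exists yet, so everything
about it is an ASSUMPTION made for the example: one "name<TAB>selector" entry
per line, with '#' for comments. The function names and host names are
hypothetical, and the polling itself (fetching each site's file) is left out.

```python
from collections import defaultdict


def parse_resources(text):
    """Parse one HYPOTHETICAL .resources file: 'name<TAB>selector' per line."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, selector = line.partition("\t")
        entries.append((name.strip().lower(), selector.strip()))
    return entries


def build_who_has_what(polled):
    """Merge {host: .resources text} into a sorted {resource: [(host, selector)]}."""
    index = defaultdict(list)
    for host, text in polled.items():
        for name, selector in parse_resources(text):
            index[name].append((host, selector))
    # Sorting both the resource names and the holders gives the
    # "automatic sorting" mentioned above for free.
    return {name: sorted(holders) for name, holders in sorted(index.items())}


if __name__ == "__main__":
    polled = {
        "host-b.example": "EMBL\t/db/embl\n",
        "host-a.example": "# local copies\nGenBank\t/genbank\nembl\t/embl\n",
    }
    for name, holders in build_who_has_what(polled).items():
        print(name, holders)
```

Each site would maintain only its own file; the central index is rebuilt from
whatever the poll returns, so a site that disappears simply drops out at the
next run.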

Next, I would appreciate hearing who has what and who only has links. Quite a 
few holes around the globe have an 'About Gopher' item locally and the 
rest are links. While this seems to be most comfortable, it hides the 
costs that these poor guys out there incur to provide the services. 

Classification of services is also an issue. It helps little if someone
puts up a server and doesn't update it regularly. From my own experience 
I know that some activities are short-lived at best. However, services 
like Don's GenBank etc. have become a sort of standard habit for some of 
us, and browsing through the accounting logs I find that about 80% of the 
queries come from 10% of the nodes. I would appreciate it if these nodes
thought about setting up their own resources and/or making these available to 
the public, under the control of established data sets to make sure that 
DUPLICATION and not imitation is the goal. Nothing is worse than a bad 
copy!
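
A figure like "80% of the queries come from 10% of the nodes" can be derived
from an accounting log along the following lines. This is only a sketch: the
layout of the real accounting logs is not given here, so the function simply
takes a sequence with one host name per logged query, and the name top_share
is hypothetical.

```python
from collections import Counter


def top_share(hosts, percent=10):
    """Fraction of all queries issued by the busiest `percent` % of nodes.

    `hosts` holds one entry per logged query (the querying node's name).
    """
    counts = Counter(hosts)
    if not counts:
        return 0.0
    # Number of nodes in the top group, rounded up, and at least one node.
    k = max(1, -(-len(counts) * percent // 100))
    top_queries = sum(n for _, n in counts.most_common(k))
    return top_queries / sum(counts.values())


if __name__ == "__main__":
    # Made-up log: one busy node plus nine occasional ones.
    log = ["busy.example"] * 82
    for i in range(9):
        log += [f"node{i}.example"] * 2
    print(f"top 10% of nodes issue {top_share(log):.0%} of the queries")
```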

In summary, it is not a bad idea to have a schema. However, the manpower
needed to set it up (AND MAINTAIN!!!) is more than I can currently get 
funding for. The worst thing in all this is that the 
individual sites depend heavily on local funds, and are thus very much 
restricted in setting up 'global' services.  I currently don't block any 
access, but since last month we have had several gigabytes coming off the 
bioftp gopher/ftp/hassle system, and this is causing some people to 
think of future access restrictions. Again, I currently have no intention 
of blocking access, but in the future service providers like us will need 
to think about accounting. 

Accounting for networked access, for example, would include sending 
warnings to hardcore users. If these are from non-provider sites, 
they might need to contribute funds, or get blocked after a certain 
volume (let's say, after 250 MBytes). You are right in saying that 
there are a lot of GOPHER questions to be asked, but some of our 
non-Swiss FTP customers downloaded the EMBL CD en bloc. This is 
definitely not the intention of providing ftp servers! But GOPHER also 
starts to go bananas occasionally: now that I see that there are apparently
two sites which have set up automatic gopherized queries via whatever script
(same time of day, same questions asked on a DAILY basis), you 
will understand that I start worrying as a service provider... 
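
The volume-based accounting suggested above could look roughly like this. It
is a sketch under stated assumptions, not a policy that is in place: the
250 MB limit is the "let's say" figure from the text, while the 200 MB
warning level, the function name account, and the host names are
hypothetical.

```python
from collections import defaultdict

LIMIT_BYTES = 250 * 10**6   # "let's say, after 250 MBytes"
WARN_BYTES = 200 * 10**6    # HYPOTHETICAL warning level, not from the text


def account(transfers, provider_sites=frozenset(),
            limit=LIMIT_BYTES, warn=WARN_BYTES):
    """Map each host to 'ok', 'warn' or 'block' from (host, bytes) records.

    Heavy users get a warning; only hosts that are not themselves providers
    are blocked once they exceed the limit.
    """
    totals = defaultdict(int)
    for host, nbytes in transfers:
        totals[host] += nbytes

    actions = {}
    for host, total in totals.items():
        if total >= limit and host not in provider_sites:
            actions[host] = "block"
        elif total >= warn:
            actions[host] = "warn"
        else:
            actions[host] = "ok"
    return actions


if __name__ == "__main__":
    records = [
        ("cd-puller.example", 180 * 10**6),
        ("cd-puller.example", 120 * 10**6),
        ("mirror.example", 400 * 10**6),
        ("casual.example", 3 * 10**6),
    ]
    print(account(records, provider_sites={"mirror.example"}))
```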

Last, let me remind you that the Pisa networking conference starting 
next Monday is meant for such discussions, and I hope to see some of you there.

-- 
+----------------------------------+-------------------------------------+
|    Dr. Reinhard Doelz            | RFC     doelz at urz.unibas.ch         |
|      Biocomputing                | DECNET  20579::48130::doelz         |
|Biozentrum der Universitaet       | X25     022846211142036::doelz      |
|   Klingelbergstrasse 70          | FAX     x41 61 261- 6760 or 267- 2078     
|     CH 4056 Basel                | TEL     x41 61 267- 2076 or 2247    |   
+------------- bioftp.unibas.ch is the SWISS EMBnet node ----------------+
               -----------------------------------------




More information about the Bio-soft mailing list
