bio gopher development mailing list

doelz at urz.unibas.ch doelz at urz.unibas.ch
Thu Feb 27 16:18:16 EST 1992

Though biology is a 'small' branch in the 'big' informatics world, I would 
like to develop it as effectively as possible. There is the 'normal' gopher
list, but I am not sure how it fits into the development of biology-specific
applications in general. Therefore, I have created a mailing list 
called biogopher at comp.bioz.unibas.ch, which shall serve as a discussion forum 
among developers for the gopher AND OTHER alternatives in the biocomputing
server world. Send mail to doelz at urz.unibas.ch and I'll add you to the list.

This mailing list is available on the bioftp gopher, where it is reindexed
daily, and also on the bioftp server, in programs/gopher-list.

Name=bioftp EMBnet Switzerland  (experimental)



How to add a new format to the indexing system for gophers. 

0) Obtain Don Gilbert's modified software for biology 
from ftp.bio.indiana.edu and install it.

1) Go to the directory 'ir' of the wais system. Edit the file 
ircfiles.c and look at the existing formats. Briefly, you need 

a) a description of the separators, so that you know which lines belong 
   to which entry 
b) a list of items you don't want to have in the indexing procedure 
c) a sufficiently representative example entry. 

There are three (maybe four) functions to be edited. 

separator_function: this returns 'true' if the separator condition as 
                       defined in (a) matches. 
header_function:    this fills the header (i.e., what you see displayed 
                       as a hit later) with relatively short strings. 
finish_header_function: this copies the header as assembled earlier 
                       to the 'header' variable used in the indexing, 
                       and truncates it if needed. 
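
As a rough illustration of these three routines, here is a minimal sketch
for a hypothetical format 'aaa'. Everything specific in it is an assumption
made for the example and is not taken from ircfiles.c: entries are assumed
to start with a line beginning "ID ", the headline is assumed to come from
lines beginning "DE ", and the buffer sizes and the local 'boolean' typedef
are placeholders for whatever the wais sources actually define.

```c
#include <assert.h>
#include <string.h>

typedef int boolean;          /* stand-in for the wais 'boolean' type */

#define AAA_BUFFER_LEN 256    /* assumed size of the working buffer */
#define AAA_HEADER_LEN 60     /* assumed maximum headline length */

/* Headline text collected for the entry currently being read. */
static char aaa_buffer[AAA_BUFFER_LEN + 1];

/* Returns true when 'line' starts a new entry (assumed tag: "ID "). */
boolean aaa_separator_function(char *line)
{
    assert(line != NULL);
    return strncmp(line, "ID ", 3) == 0;
}

/* Appends the text of description lines (assumed tag: "DE ") to the
   working buffer; all other lines are ignored. */
void aaa_header_function(char *line)
{
    size_t used, len;

    assert(line != NULL);
    if (strncmp(line, "DE ", 3) != 0)
        return;
    line += 3;                            /* skip the tag */
    len = strcspn(line, "\r\n");          /* drop the line ending */
    used = strlen(aaa_buffer);
    if (used > 0 && used < AAA_BUFFER_LEN)
        aaa_buffer[used++] = ' ';         /* join continuation lines */
    if (len > AAA_BUFFER_LEN - used)
        len = AAA_BUFFER_LEN - used;      /* never overrun the buffer */
    memcpy(aaa_buffer + used, line, len);
    aaa_buffer[used + len] = '\0';
}

/* Copies the collected text to 'header', truncated to AAA_HEADER_LEN
   characters, and resets the buffer for the next entry. 'header' must
   have room for AAA_HEADER_LEN + 1 characters. */
void aaa_finish_header_function(char *header)
{
    assert(header != NULL);
    strncpy(header, aaa_buffer, AAA_HEADER_LEN);
    header[AAA_HEADER_LEN] = '\0';
    aaa_buffer[0] = '\0';
}
```

The indexer is expected to call the separator and header routines for each
line it reads and the finish routine once per entry; keeping the working
buffer larger than the final headline leaves the truncation to the finish
routine, as described above.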

Don Gilbert has added another function called 

filter_for_index:   This filters out any strings not desired in the indexing, 
                       such as the entry labels in the database etc. 
                       It also lets you blank out any lines which 
                       are not needed (sequence data, etc.) 
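
The filtering idea can be sketched as follows. This is only an illustration:
the name aaa_filter_for_index and its signature are hypothetical (check Don
Gilbert's sources for the real interface), and the record layout is assumed,
with every line starting with a two-letter label, sequence data following a
line that begins "SQ ", and each entry ending with "//".

```c
#include <assert.h>
#include <string.h>

/* Nonzero while the lines being read belong to the sequence block. */
static int in_sequence = 0;

/* Overwrites up to 'n' characters with blanks, leaving the line ending
   in place. */
static void blank_out(char *s, size_t n)
{
    size_t i;

    for (i = 0; i < n && s[i] != '\0' && s[i] != '\n' && s[i] != '\r'; i++)
        s[i] = ' ';
}

/* Hypothetical filter: edits 'line' in place so that labels and
   sequence data never reach the index. */
void aaa_filter_for_index(char *line)
{
    assert(line != NULL);
    if (strncmp(line, "//", 2) == 0) {          /* assumed end of entry */
        in_sequence = 0;
        blank_out(line, strlen(line));
    } else if (in_sequence) {                   /* sequence data */
        blank_out(line, strlen(line));
    } else if (strncmp(line, "SQ ", 3) == 0) {  /* assumed sequence start */
        in_sequence = 1;
        blank_out(line, strlen(line));
    } else {
        blank_out(line, 2);                     /* hide the entry label */
    }
}
```

Blanking characters instead of removing them keeps the line lengths
unchanged, which matters if the indexer records positions in the file.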

2) Choose a meaningful name for the new format. Look at the 
existing formats with 

% waisindex 

without arguments. 

3) Write the routines needed to describe the header, starting from one of 
the examples already in ircfiles.c. Do not forget to describe 
the database format properly. Call the functions you write 


(assuming that your format is called aaa and that the filter is needed as well).

4) Edit the file ircfiles.h and add the function declarations, e.g., 

/* aaa   flat files -- rd*/
boolean aaa_separator_function _AP((char *line));
void aaa_header_function _AP((char *line));
void aaa_finish_header_function _AP((char *header));

5) Edit the file irbuild.c and add a printf statement for the new format 
to the text that is displayed if 'waisindex' is typed without arguments. 

Next, move further down in the file and add the routines for the parser. E.g., 
      else if(0 == strcmp("aaa", next_argument)){/* rd */
        typename = next_argument;
        type = "TEXT";
        separator_function = aaa_separator_function;
        header_function = aaa_header_function;
        finish_header_function = aaa_finish_header_function;
      }
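
To make the role of these assignments clearer, here is a self-contained
sketch of the same idea: the format name given on the command line selects
a set of function pointers that the indexer later calls. It is not code
from irbuild.c. The struct format_hooks, the function select_format and the
stub routines are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int boolean;          /* stand-in for the wais 'boolean' type */

typedef boolean (*separator_fn)(char *line);
typedef void (*header_fn)(char *line);
typedef void (*finish_header_fn)(char *header);

/* Stubs standing in for the real aaa_* routines written in step 3. */
boolean aaa_separator_function(char *line) { return line[0] == 'I'; }
void aaa_header_function(char *line) { (void)line; }
void aaa_finish_header_function(char *header) { header[0] = '\0'; }

/* Hypothetical bundle of the settings chosen for one format. */
struct format_hooks {
    const char *type;
    separator_fn separator_function;
    header_fn header_function;
    finish_header_fn finish_header_function;
};

/* Fills 'hooks' for a known format name and returns 1; returns 0 and
   leaves 'hooks' untouched if the name is not recognised. */
int select_format(const char *name, struct format_hooks *hooks)
{
    assert(name != NULL && hooks != NULL);
    if (0 == strcmp("aaa", name)) {
        hooks->type = "TEXT";
        hooks->separator_function = aaa_separator_function;
        hooks->header_function = aaa_header_function;
        hooks->finish_header_function = aaa_finish_header_function;
        return 1;
    }
    return 0;
}
```

A format whose name is not matched by any branch is never wired up, which
is why the new 'else if' has to be added before 'waisindex -t aaa' can work.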

6) Compile the program with 'make'. 
If your 'waisindex' image resides somewhere else, copy the image to 

7) Try it with a new database. Go to the directory where you keep your 
indices and create a new directory. Then change to that directory 
and make a small shell script to index the files with your format, e.g. 
cd /mnt/gopher-data/index/aaa
waisindex  -t aaa /mnt/gopher-data/data/aaa/bbb.dat  

Next, make the directory for the data and copy the data there (a hard link 
also works), then execute the script in the index directory. 

8) Go to your public directory for the gopher and announce the index 
in a .link file such as 
Name=search for a key word        

9) Back in the index directory, edit the two files hostdata and listdata, 
following the examples given by Don Gilbert, such as 

and, listdata: 

10) Last, edit /etc/rc.local or the corresponding script to start a new 
gindexd at boot time. 

For debugging, it is useful to insert printf statements in the 
finish_header_function. You can also display the file index.hl in the 
index directory. netstat -a can tell you whether the index daemon started 
properly. The gopher log file can tell you more about errors when opening files. 

Try it, good luck. 

My changes are available on the bioftp server in the programs/gopher-list
hierarchy. They are also on Gopher under 'About', then 'biogopher'. 

    Dr. Reinhard Doelz            *     EAN     doelz at urz.unibas.ch
      Biocomputing                *     DECNET  20579::48130::doelz
Biozentrum der Universitaet       *     X25     psi%022846211142::embnet
   Klingelbergstrasse 70          *     FAX     x41 61 261- 6760 
     CH 4056 Basel                *     TEL     x41 61 267- 2076 or 2247   

