Though biology is a 'small' branch of the 'big' informatics world, I would
like to develop it as effectively as possible. There is the 'normal' gopher
list, but I am not sure how it fits into the development of biology-specific
applications in general. Therefore, I have created a mailing list
called biogopher at comp.bioz.unibas.ch, which shall serve as a discussion forum
among developers for gopher AND OTHER alternatives in the biocomputing
server world. Send mail to doelz at urz.unibas.ch and I'll put you on the list.
The mailing list is available on the bioftp gopher, and is reindexed daily.
It is also on the bioftp server, in programs/gopher-list.
###################################
#
Name=bioftp EMBnet Switzerland (experimental)
Type=1
Port=70
Path=
Host=bioftp.unibas.ch
#
###################################
Regards
Reinhard
How to add a new format to the indexing system for gophers.
===========================================================
0) Obtain Don Gilbert's modified software for biology
from ftp.bio.indiana.edu and install it.
1) Go to the directory 'ir' of the wais system. Edit the file
ircfiles.c and look at the existing formats. Briefly, you need to
have
a) a description of the separators so that you know which items belong
to different formats
b) a list of items you don't want to have in the indexing procedure
c) a sufficiently representative example entry.
There are three (maybe four) functions to be edited.
separator_function: this returns 'true' if the separator condition as
defined in (a) matches.
header_function: this fills the header (i.e., what you see displayed
as a hit later) with relatively short strings.
finish_header_function: this copies the header as assembled earlier
to the 'header' variable used in the indexing,
and truncates it if needed.
Don Gilbert has added another function called
filter_for_index: this filters out any strings not desired in the indexing,
such as the entry labels in the database.
It also allows blanking out any lines which
are not needed (sequence data, etc.).
2) Define a significant name for the new format. Look at the
existing formats with
% waisindex
without arguments.
3) Write the routines needed to describe the header, using one of
the examples already in ircfiles.c as a template. Do not forget to describe
the database format properly. Call the functions you write
aaa_separator_function
aaa_header_function
aaa_finish_header_function
aaa_filter_for_index
(assuming that your format is called aaa and that the filter is needed as well).
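As an illustration of step 3, here is a minimal sketch of what the four routines might look like. Everything specific in it is an assumption made for the example, not taken from ircfiles.c: the 'aaa' entries are assumed to start with an "ID" line, the hit title is assumed to come from the first "DE" line, sequence lines are assumed to start with a blank, and the signature of aaa_filter_for_index, the boolean typedef and MAX_HEADER_LEN are hypothetical stand-ins. Check the existing formats in ircfiles.c for the real conventions.

```c
#include <stdio.h>
#include <string.h>

typedef int boolean;          /* stand-in for the boolean type of the wais sources */
#define MAX_HEADER_LEN 100    /* hypothetical limit for the displayed header */

/* Title collected for the entry currently being read. */
static char aaa_title[MAX_HEADER_LEN + 1];

/* Assumption: every entry of the hypothetical 'aaa' format starts with "ID ". */
boolean aaa_separator_function(char *line)
{
    return strncmp(line, "ID ", 3) == 0;
}

/* Assumption: the first "DE" line of an entry holds the text shown as a hit. */
void aaa_header_function(char *line)
{
    if (strncmp(line, "DE ", 3) == 0 && aaa_title[0] == '\0') {
        const char *text = line + 3;
        while (*text == ' ')
            text++;                                   /* skip padding after the label */
        strncpy(aaa_title, text, MAX_HEADER_LEN);
        aaa_title[MAX_HEADER_LEN] = '\0';             /* truncate if needed */
        aaa_title[strcspn(aaa_title, "\r\n")] = '\0'; /* drop the line end */
    }
}

/* Copies the assembled title to 'header' and resets it for the next entry.
   The caller must supply at least MAX_HEADER_LEN + 1 bytes. */
void aaa_finish_header_function(char *header)
{
    strcpy(header, aaa_title[0] != '\0' ? aaa_title : "Unknown title");
    aaa_title[0] = '\0';
}

/* Hypothetical signature: blanks out, in place, what should not be indexed.
   Lines starting with a blank are treated as sequence data and blanked
   completely; on all other lines only the two-letter label is blanked. */
void aaa_filter_for_index(char *line)
{
    size_t i;
    size_t len = strcspn(line, "\r\n");
    size_t end = (line[0] == ' ') ? len : (len < 2 ? len : 2);

    for (i = 0; i < end; i++)
        line[i] = ' ';
}
```

The title is kept in a static buffer because the header is built up line by line and only copied out once the entry is complete.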
4) Edit the file ircfiles.h and add the function declarations, e.g.,
/* aaa flat files -- rd*/
boolean aaa_separator_function _AP((char *line));
void aaa_header_function _AP((char *line));
void aaa_finish_header_function _AP((char *header));
5) Edit the file irbuild.c and add a printf statement for the new format to
the list that is displayed if 'waisindex' is typed without arguments.
Next, move further down in the file and add the routines to the parser. E.g.,
else if(0 == strcmp("aaa", next_argument)){/* rd */
  typename = next_argument;
  type = "TEXT";
  separator_function = aaa_separator_function;
  header_function = aaa_header_function;
  finish_header_function = aaa_finish_header_function;
}
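To show how such a branch wires the format name to its routines, here is a small stand-alone model of the dispatch. It is only a sketch: format_handlers, select_format and print_format_usage are hypothetical names invented for the example (irbuild.c uses plain variables, as in the fragment above), and the aaa_ routines are reduced to stubs.

```c
#include <stdio.h>
#include <string.h>

typedef int boolean;   /* stand-in for the boolean type of the wais sources */

/* Hypothetical bundle of the per-format routines chosen by the parser. */
typedef struct {
    const char *type_name;
    boolean (*separator_function)(char *line);
    void (*header_function)(char *line);
    void (*finish_header_function)(char *header);
} format_handlers;

/* Stubs standing in for the real routines from ircfiles.c. */
static boolean aaa_separator_function(char *line)
{
    return strncmp(line, "ID ", 3) == 0;
}

static void aaa_header_function(char *line)
{
    (void)line;   /* nothing collected in this stub */
}

static void aaa_finish_header_function(char *header)
{
    strcpy(header, "aaa entry");
}

/* Returns 1 and fills *out if the format name is known, 0 otherwise. */
int select_format(const char *next_argument, format_handlers *out)
{
    if (0 == strcmp("aaa", next_argument)) {
        out->type_name = "aaa";
        out->separator_function = aaa_separator_function;
        out->header_function = aaa_header_function;
        out->finish_header_function = aaa_finish_header_function;
        return 1;
    }
    return 0;
}

/* The extra line for the list printed when no arguments are given. */
void print_format_usage(FILE *fp)
{
    fprintf(fp, "    aaa    (aaa flat files)\n");
}
```

An unknown name returns 0 here so that the caller can print the list of formats instead of indexing with the wrong routines.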
6) Compile the program with 'make'.
If your installed 'waisindex' binary resides somewhere else, copy the new
binary there.
7) Try it with a new database. Go to the directory where you keep your
indices and create a new directory. Then change to that directory
and write a small shell script to index the files with your format, e.g.
#!/bin/csh
cd /mnt/gopher-data/index/aaa
waisindex -t aaa /mnt/gopher-data/data/aaa/bbb.dat
Next, make the directory for the data and copy the data there (a hard link
also works), then execute the script in the index directory.
8) Go to your public directory for the gopher and announce the index
in a .link file such as
Type=7
Host=xx.example.com
Port=456
Path=1/
Name=search for a key word
9) Back in the index directory, edit the two files hostdata and listdata,
following the examples given by Don Gilbert, such as
hostdata:
xxx.example.com
150
/mnt/gopher-data/data
and, listdata:
xxx.example.com
152
/usr/tmp/gopher
10) Last, edit /etc/rc.local or the corresponding script to start a new
gindexd at boot time.
DEBUGGING:
It is useful to insert printf statements in the finish_header_function.
One can also display the file index.hl in the index directory.
netstat -a can tell you whether the index daemon started properly.
The gopher log file can tell you more about errors when opening files.
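A debugging printf in the finish_header_function could look like the sketch below. The AAA_DEBUG switch, the aaa_title buffer and the "Unknown title" fallback are assumptions made for the example; the trace goes to stderr so that it does not mix with the normal output of waisindex.

```c
#include <stdio.h>
#include <string.h>

#define AAA_DEBUG 1            /* hypothetical switch; set to 0 to silence the trace */

/* Title assembled earlier by the header function (assumed layout). */
static char aaa_title[101];

/* Copies the title to 'header' (at least 101 bytes) and traces the result. */
void aaa_finish_header_function(char *header)
{
    strcpy(header, aaa_title[0] != '\0' ? aaa_title : "Unknown title");
#if AAA_DEBUG
    fprintf(stderr, "aaa header: '%s'\n", header);
#endif
    aaa_title[0] = '\0';       /* reset for the next entry */
}
```

If many entries show up with the fallback text, the header function is most likely not recognizing the line it should take the title from.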
Try it, good luck.
Reinhard
PS:
My changes are available on the bioftp server in the programs/gopher-list
hierarchy. They are also on Gopher under 'About', then 'biogopher'.
************************************************************************
Dr. Reinhard Doelz * EAN doelz at urz.unibas.ch
Biocomputing * DECNET 20579::48130::doelz
Biozentrum der Universitaet * X25 psi%022846211142::embnet
Klingelbergstrasse 70 * FAX x41 61 261- 6760
CH 4056 Basel * TEL x41 61 267- 2076 or 2247
************************************************************************