Sniping on the new (was: what software do biologists need?)

S S Sturrock sss at castle.ed.ac.uk
Wed Mar 24 10:36:08 EST 1993

In article <1opd5pINN1s24 at rs1.rrz.Uni-Koeln.DE> khofmann at biomed.biolan.uni-koeln.de writes:
>- I disapprove of shareware, at least in science. In my opinion, the typical
>  situation of a non-professional program author in science is that he/she
>  writes the particular program for his/her own needs and in many cases
>  is paid for it anyway by the employer. Why does the programmer try to
>  earn money with the product and not share it with his/her fellow researchers?

Well, let me put it this way.  Although I get paid to do research into
techniques on parallel computers, that pay is for the research stage.
Turning the code into something more usable to the outside world requires
work outside of my 9-5 job, since I am meant to be doing research, not
dandying up a piece of functional code.  I have been quite capable of
running my programs from the first incarnation, but most people would throw
a fit trying to work through it: no documentation, obscure shortcuts and
commands, separate programs used in sequence to perform the task.  Fine for
me, but I have had to plug away making it idiot proof and generally
tailoring the code to work in the various installations.  (It still hasn't
worked, though: I noticed someone sending a pattern into MPsrch at EMBL the
other day, you know, something like SF[TA][HA]AA[AI] or the like.  While
this is a nice idea and I expect I will incorporate it later, the code does
not do this just yet.)
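
For anyone unfamiliar with the notation, a pattern such as SF[TA][HA]AA[AI]
uses bracketed sets to mean "any one of these residues at this position".
The sketch below is only an illustration of that notation and is not how
MPsrch works (which, as said, does not handle patterns yet).  It leans on
the standard Python `re` module, whose character-class syntax happens to
coincide with the bracket notation.  The function name `find_pattern` and
the example sequence are made up for the purpose.

```python
import re

def find_pattern(pattern, sequence):
    """Return (start, matched_text) for every occurrence of pattern.

    Assumes the pattern uses only plain residues and [..] sets, so it can
    be handed to the regex engine unchanged.
    """
    # Wrapping the pattern in a lookahead reports overlapping hits too.
    regex = re.compile("(?=(" + pattern + "))")
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]

# Hypothetical sequence, invented for illustration only.
seq = "MKSFTHAAAGGSFAAAAILK"
for start, hit in find_pattern("SF[TA][HA]AA[AI]", seq):
    print(start, hit)
```

A plain exact-match search does none of this, which is why such a query is
not understood by a program that expects an ordinary sequence.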

If anything, this sort of work is the really time-consuming part, since
everyone wants to have their say, whereas before I could just plunge ahead
and do it the way I wanted and the rest of the world be damned.

>- more generally, I don't see the point in writing copyright-protected small
>  pieces of software in science, unless you are forced to do so by some
>  contract.
>  Putting a copyright on the program prevents others from modifying the
>  program for their own needs. There is hardly any source code with copyrighted
>  software and what is so unfavourable in letting others improve your
>  programs?

Depends on what you consider small.  If your code is small because it is
compact and fast, then copyright away.  We trademarked the name of the
code too, just for good measure.

On the other hand, I wrote a library which reconstructs alignments very
fast in very little memory (it has successfully aligned 50,000 bp with gaps
in only 8 MB) and is used in just about all the programs we have produced
in the last year.  The library contains lots of useful routines, which
means we can plug together a new searching program in far less time than
previously.  What will happen to that, I don't know.

>- A major drawback in almost all PD/freeware/shareware programs dedicated to
>  SEQUENCE ANALYSIS is the restriction to a particular sequence format.
>  I thought that using Don Gilbert's READSEQ and similar routines, it should
>  be possible to write format-independent software; this would substantially
>  enlarge the group of people able to apply the program.
>  (since most of these programs have no sources available, it is not possible
>   to include routines for dealing with different sequence formats later)

What I have done is to allow for a fairly free format around the FASTA
style, so it is not hugely difficult to format a query to work.  The Blitz
service takes its own format, converts it to mine and stuffs it into the
program.  I expect that, given a little impetus and time, I will produce a
code that can read the various formats out there, but for the moment FASTA
format is fine.  As for the database format, I prefer to leave that in its
native form as seen on the CD-ROM.  I don't see the point in creating a
whole new set of files which contain all the information in the original
flat file, so I don't; I just use it as it comes.  That saves on disc space
too, since I can even run directly from the CD-ROM, albeit a little more
slowly.
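
To give a rough idea of what "fairly free format around the FASTA style"
could mean in practice, here is a minimal sketch of a tolerant reader.  The
particular tolerances (blank lines, position numbers and spaces inside the
sequence, a missing header on the first record) are assumptions chosen for
illustration, not a description of what the actual program accepts, and
`read_fasta_loose` is a hypothetical name.

```python
def read_fasta_loose(lines):
    """Parse FASTA-like text leniently; return a list of (header, sequence)."""
    records = []
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines anywhere
        if line.startswith(">"):
            # A new header closes the record collected so far, if any.
            if header is not None or chunks:
                records.append((header or "unnamed", "".join(chunks)))
            header, chunks = line[1:].strip(), []
        else:
            # Keep letters and '*' only, dropping spaces, digits and other
            # layout characters that often surround pasted sequences.
            kept = "".join(c for c in line if c.isalpha() or c == "*")
            chunks.append(kept.upper())
    if header is not None or chunks:
        records.append((header or "unnamed", "".join(chunks)))
    return records

# Hypothetical query, invented for illustration only.
query = """
>my query
sfthaaa 10
GG KK
""".splitlines()
print(read_fasta_loose(query))  # [('my query', 'SFTHAAAGGKK')]
```

Reading other formats would then mean adding further front ends that
produce the same (header, sequence) pairs, leaving the search code alone.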

>- I think that far more software is written by researchers all over the world
>  that could be used by others. Why are so relatively few programmers
>  willing to let others use their programs, too?

It's not really quite like that.  As I said, quickie codes are generally
written in a slapdash manner, and it takes time to make code acceptable for
the rest of the world.  Sometimes users will say that it would be nice to
add a tiny little feature which seems like no big deal (for instance the
pattern matching problem), but it may well require a heavy rewrite on the
part of the programmer.

>Phew, I'm feeling much better now.
>Any comments?

How's that for starters?

Shane Sturrock, Biocomputing Research Unit, Darwin Building, Mayfield Road,
University of Edinburgh, Scotland, Commonwealth of Independent Kingdoms.  :-)

Civilisation is a Haggis Supper with salt and sauce and a bottle of Irn Bru.
