EGCG 9.0 and beyond

pmr at sanger.ac.uk
Wed Mar 19 07:58:58 EST 1997

There has been a steady stream of questions about EGCG for GCG 9.0.
The situation is as follows.

The present

GCG 9.0, like any major release, has a number of major internal
changes that broke EGCG programs.  We were able to identify and fix
most of these during the GCG 9.0 beta test.

The full GCG 9.0 is now released. As everyone is aware, GCG 9.0 now
has a separate "software developers kit" (SDK) which includes the
source code. We need that source code to build and test EGCG 9.0.

Some sites have asked whether the GCG "SDK" licence allows EGCG to be
distributed. The answer is "yes, but only as binaries". Also, this
needs some modifications to the standard SDK licence which are not yet

As soon as we have the GCG 9.0 source code, the schedule is:

o Build EGCG 9.0 with GCG 9.0, test all programs and update as needed
  (estimate: 1 month)

o (Assuming the licence permits) build an EGCG 9.0 beta release with
  source code, to be tested by selected sites who also have the GCG SDK.
  (estimate: 2-month test period)

o Build binary releases of EGCG 9.0 (with the help of the EGCG beta
  test sites) for all platforms that are supported by GCG 9.0

Assuming the GCG SDK arrives in April, that makes July the earliest
date for a full binary release of EGCG 9.0.

Further binary releases will follow, probably at two-month intervals,
with new applications and bugfixes.

Brief Release Notes

EGCG 9 will feature a number of new programs. Most will be in the initial
release. A few will be delayed until after the first release of EGCG 9
has been through testing.

o EPHYLIP : 11 programs from Joe Felsenstein's PHYLIP package
           (these are also available as a supplement to EGCG 8.1)
o EFASTA etc : a set of programs from Bill Pearson's fasta3 and fasta2
o FINDCLUSTER etc. : a set of programs to find and cluster similar
                     DNA sequences
o CLEANUP : Generates a non-redundant sequence data library from a set
            of DNA sequences
o BANANA : Calculates DNA bending along a nucleotide sequence
o EST2GENOME : Aligns an EST or cDNA sequence to a genomic sequence
o MSU : Rainer Fuchs' integrated mail server request suite
o ORFEX : Searches for open reading frames on both strands
o SAD : Searches and destroys repeat sequences
o SIMPCR : Simulates PCR by matching primers to a sequence database
o TMAP : Predicts membrane-spanning regions (Persson and Argos algorithm)

Future Plans

The future, once these applications are integrated into EGCG 9, is
rather different.

Making the EGCG source code available has been a major source of
useful suggestions and of new applications. Although we understand the
need for GCG's SDK licence, this does not fit with the aims of EGCG
developers or with the needs of some major EGCG user sites.

We also now have requests from a number of sites, including some of
our scientific collaborators, for versions of programs used in genome
analysis where the officially supported version is the one in EGCG,
but where these sites do not have GCG installed.

Also, for some years, we have been developing our own library routines
to extend and to supersede the functions in the GCG procedure

We have now reached the stage where it is better to use our own
library code, or other publicly available libraries, for functions
such as the user interface, sequence reading, data files and graphical
output. These are the major areas where the GCG libraries are still
used in EGCG.

The next major release of EGCG (EGCG 10.0) will be independent of GCG,
and will instead be integrated into a new package provisionally called
EMBOSS ("The European Molecular Biology Open Software Suite") which
will make its source code available.

EMBOSS will include many of the present EGCG 9 applications, plus some
other packages and new applications which will share the same libraries
and interfaces.

We intend that EMBOSS applications will continue to work together with
the GCG package. In some areas, it may be that they work better with
GCG than the current EGCG is able to. They will, however, be extended
in many ways that are simply not possible within the present design of

As part of the EMBOSS project, we have to resolve a number of issues
over those applications that are derived from GCG code. The most
useful of these, we hope, can be integrated into the GCG package or
else distributed in some other form. The others, those which we
believe are not being used, will be declared "obsolete" and no longer
supported. A list of obsolete applications will be prepared in time
for the EGCG 9 release, and these applications will still be included
in EGCG 9.

Some details of the EMBOSS project are not yet finalized. We hope that
these will become clearer in the coming months.

