Peter Rice pmr at staffa.sanger.ac.uk
Wed Oct 12 14:51:37 EST 1994

In article <1994Oct11.171249.1 at bobby.iaf.cnrs-gif.fr> bochet at bobby.iaf.cnrs-gif.fr writes:
>   I am sorry if this question has been asked already:
>   Should the current version of EGCG, the extensions for GCG written at  EMBL 
>   run with version 8 of GCG, even without the graphic interface WPI ? Or do we
>   have to await a new version of EGCG ?

Au contraire, but I have been expecting someone to ask. EGCG 8.0 is due
for release 'soon'. The current status report is attached.


EGCG Extended GCG, Release 8.0: Status Report      12 October 1994
=============================================      ===============

The EGCG package is a suite of programs which extend the standard
GCG (Wisconsin) Sequence Analysis package. The EGCG programs were
developed by Peter Rice (formerly EMBL, Germany and now at the
Sanger Centre, Hinxton, UK), Rodrigo Lopez (Biotechnology Centre
of Oslo), Jaakko Hattula (Tampere, Finland), Reinhard Doelz (Basel,
Switzerland) and Jack Leunissen (Nijmegen, Netherlands). The programs
are supported on VMS and Unix systems for GCG 7.3, and will soon be
released for GCG 8.0, also on VMS and Unix.

In the past, the programs have been included as unsupported software
in the GCG distribution, as well as being available by anonymous ftp.

Due to the many changes in GCG 8.0, we have preferred to wait for
the full GCG release before testing and releasing EGCG 8.0. Most
of the changes required for compatibility with GCG 8.0 (there were
many) have already been made during the GCG beta test period. In
the next few weeks we hope to complete building and testing the
programs on OpenVMS, Irix and OSF. Once testing is completed, we
can release EGCG 8.0 as a beta version by anonymous ftp.

Until then, sites can continue to run GCG 7.3 and EGCG to provide
users with the EGCG programs, while running GCG 8.0 for the new
GCG programs.

The initial EGCG 8.0 release will support only the command-line
interface. WPI will follow.

EGCG 8.0 will be identical for VMS and Unix. In the past we have
released separate versions for VMS and Unix, but now will integrate
everything. Unix is now the main operating system(s) for the EGCG
team, so VMS will in future be the secondary system, and all further
developments will be tested first on Unix.

EGCG 8.0 also features many extensions to the Procedure Library to
simplify the task of adding new applications and porting existing ones.
We expect this feature of EGCG to grow considerably in the future.
Documentation of the new EGCG Procedure Library will be completed after
the beta release.

All EGCG 8.0 programs can be run in batch mode - this is a direct benefit
of the modified procedure library.

The WPI interface will be added to EGCG only after the initial beta
release. It will take some time to generate the configuration files,
and we plan to modify WPI if necessary to make the EGCG programs
easier to use (this probably only means adding a top level menu item,
but it is too early to be sure how much effort is involved).

EGCG code remains in Fortran, although like GCG it will increasingly
use C in the future.

Highlights of EGCG 8.0 include:

New programs:

Prettybox has been submitted for inclusion by Rick Westerman. Prettybox
produces shaded alignments for PostScript printers.

About 20 programs omitted from the last Unix port (EGCG 7.1) will be
included this time or as soon as possible after. The full list of
these programs is included below.

Some other programs have also been submitted, but will not be fully
integrated until the remainder of EGCG 8.0 is stable.

'Obsolete' programs:

GCG have decided to drop QuickSearch from the new release. The EGCG versions
of the Quick programs make far fewer demands on system resources, and were
run successfully as a service by EMBL for some years. One of the major
users of that service is a colleague at the Sanger Centre, so naturally
these programs will continue to be supported in EGCG. The emphasis will be
on supporting long but imperfect sequences from genome-scale projects,
so further improvements to the programs will be made over the next year.

We are open to requests to reimplement the DIE command too. We have found it
very useful when running training courses to get the students to log out and
go home ("when you've checked the FASTA results, why not try DIE and see
what happens ..."), and have also seen users with it as an alias for logout,
so it may be missed after all.


Programs previously omitted from the Unix distribution of EGCG due to
problems in porting, or introduced to EGCG after the last Unix porting
effort, will be made available as soon as possible:

	AllTrans : Translates a set of aligned DNA sequences into aligned
	    protein sequences.

	BFastA : A version of FastA using the BLOSUM62 matrix.

	BTFastA : A version of TFastA using the BLOSUM62 matrix.

	Chaos : Represents nucleotide sequences on a plane with 4 vertices.

	MapSelect : Selects restriction enzymes by name or by ability to cut
	    a specified sequence, and creates a new ENZYME.DAT file for
	    use by other programs.

	NewFeatures : An interactive editor for entering and modifying the
	    feature table, and for minor editing of the sequence itself.
	    Also able to understand most feature table syntax, including
	    joins across entries and additional qualifiers.
	    (WARNING: we expect some problems porting the library routines
	    for this program, so it may be delayed for Unix).

	NoReturn : Removes trailing carriage returns and line feeds.

	PepAllWindow : Plots hydrophobicity for one or more multiple
	    sequence alignments.

	PepCoil : Identifies potential coiled-coil regions in proteins.

	ToEmbl : Extracts an EMBL entry in EMBL format.

	ToGenBank : Extracts a GenBank entry in GenBank format.

	ToPirAll : Converts a set of sequences or subsequences into a single
	    file in PIR format.

	TProfileGap : ProfileGap with optional 6-frame translation of a
	    DNA sequence.

	TProfileSearch : ProfileSearch with ability to search any size of
	    database, and optional 6-frame translation of DNA databases.

	TProfileSegments : Processes the output file from TProfileSearch.

	TSegments : Processes TWordSearch output.

	TWordSearch : WordSearch with a 6-frame translation of the database.

	GelAnalyze : Reads the output of GelStatus, and produces project
	    statistics for shotgun sequencing.

	NewQuickIndex : A much faster version of QuickIndex that produces the
	    index files for NewQuickSearch.

	NewQuickSearch : A much faster version of QuickSearch that can run
	    on almost all systems without a major virtual memory overhead.

	QuickMatch : Displays the overlaps found by NewQuickSearch (or by
	    QuickSearch), with selection for good quality matches.
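
For readers unfamiliar with the term, the "6-frame translation" mentioned
for TProfileGap, TProfileSearch and TWordSearch means translating a DNA
sequence in the three forward and three reverse-complement reading frames.
The Python sketch below only illustrates the idea. It is not EGCG code
(EGCG is written in Fortran), the function names are made up for this
example, and it assumes the standard genetic code and an unambiguous
A/C/G/T alphabet.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, codons enumerated in TCAG order.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from the start of seq; unknown codons give 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Map frame number (+1..+3 forward, -1..-3 reverse) to its translation."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    for frame, protein in sorted(six_frame_translation("ATGGCCTAA").items()):
        print(frame, protein)
```

A search program would then compare its query against each of the six
translated frames rather than against the DNA itself.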

	In EGCG 7.2, command line control was added to all the GCG programs
	that did not provide full support. This work was done by summer student
	Jaakko Hattula from Tampere University of Technology in Finland. GCG 8.0
	now includes full command line control, but where Jaakko solved the problem
	differently, we may retain our versions too.

	The programs (see the GCG manual for details) in EGCG 7.2 were:
	EAssemble, ECodonFrequency, ECompTable, EConsensus, ECorrespond,
	ECrypt, EDiverge, EExtractPeptide, EFingerPrint, EFromStaden,  
	EGetSeq, EPublish, ERepeat, EReverse, EStatPlot, ETerminator,  
	EToStaden, ETranslate, EWindow      
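
Going back to the Chaos entry in the list of ported programs: a plane
with 4 vertices suggests the "chaos game" representation, in which each
base pulls a point halfway towards its own corner of a unit square. That
reading is an assumption, as are the corner assignments and the function
name below. The sketch shows the general technique only and says nothing
about how the EGCG program is actually written.

```python
# Corner assigned to each base (an assumed, commonly used layout).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def chaos_points(seq):
    """Return one (x, y) point per recognised base, starting from the centre."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        corner = CORNERS.get(base)
        if corner is None:
            continue  # skip ambiguity codes and gaps
        # Move halfway from the current point towards the base's corner.
        x = (x + corner[0]) / 2
        y = (y + corner[1]) / 2
        points.append((x, y))
    return points


if __name__ == "__main__":
    for point in chaos_points("ACGT"):
        print(point)
```

Plotting the points for a long sequence gives a picture in which
over-represented and under-represented short words show up as dense and
sparse regions.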


  Version 7.2 of the EGCG Programs was prepared by Peter Rice (EMBL,
  Heidelberg, Germany), Rodrigo Lopez (Biotechnology Centre of Oslo,
  Norway), Jaakko Hattula (Tampere University of Technology, Finland),
  Reinhard Doelz (Basel, Switzerland) and Jack Leunissen (CAOS/CAMM Centre,
  Nijmegen, Netherlands).

  We are very grateful to (in alphabetical order) Rein Aasland, Wilhelm
  Ansorge, Peer Bork, Thure Etzold, Toby Gibson, Tom Kristensen, Franc 
  Pattus, Kate Rice, Christian Schwager, Peter Sibbald, Julie Thompson,
  Hartmut Voss and Gert Vriend for their many contributions and critical
  comments as users of the EGCG Programs.

  We are also deeply indebted to the staff of GCG Inc. who provided rapid
  and helpful answers to our many questions during the development of 
  the programs. Many thanks to Irv Edelman, Maggie Smith, Donald Katz, Michael
  Hogan, Joseph King, Mary Schulz and especially John Devereux.


  Peter Rice      pmr at sanger.ac.uk                Tel: +44 1223-494967
  Rodrigo Lopez   rodrigol at biotek.uio.no          Tel: +47 22958756

Peter Rice                           | Informatics Division
E-mail: pmr at sanger.ac.uk             | The Sanger Centre
Tel: (44) 1223 494967                | Hinxton Hall, Hinxton,
Fax: (44) 1223 494919                | Cambs, CB10 1RQ
URL: http://www.sanger.ac.uk/~pmr    | England
