Attention fellow GCG users:
ANNOUNCEMENT
EGCG is a package of 65 programs which extend the programs in the current
GCG package and add many entirely new functions.
Many of the programs were previously available as "GCGEMBL" in the
UNSUPPORTED.BCK saveset on the GCG distribution for VMS. As the programs
are no longer entirely written at EMBL, we have changed the name of the
package. The "E" stands for "Extended" GCG.
Porting of the EGCG programs from VAX/VMS to ALPHA OpenVMS was supported by
DEC under a University Porting Agreement.
The EGCG programs can be installed as a single copy on a mixed VAX/ALPHA
cluster. Separate directories are used for the system-specific files
(object files and image files).
DESCRIPTION
(1) 26 new programs in this VMS release
(* indicates programs included in the Unix release distributed by GCG
on CD-ROM with GCG version 7.2)
AllTrans : Translates a set of aligned DNA sequences into aligned
protein sequences.
* BasePairPlot : Plots the % occurrence and obs/expected frequency of
any dinucleotide pair in a sequence.
BFastA : A version of FastA using the BLOSUM62 matrix.
BTFastA : A version of TFastA using the BLOSUM62 matrix.
* DbStats : Reports database statistics.
* GelFigure : Produces a graphical report of a contig in a Fragment
Assembly project, including restriction map, open reading
frames and fragment alignment.
* KabatToGcg : Converts the KABAT database to GCG format.
MapSelect : Selects restriction enzymes by name or by ability to cut
a specified sequence, and creates a new ENZYME.DAT file for
use by other programs.
* Melt : Calculates the melting temperature and %GC of a sequence.
* MeltPlot : Plots the melting curve for a nucleic acid sequence.
NewFeatures : An interactive editor for entering and modifying the
feature table, and for minor editing of the sequence itself.
Also able to understand most feature table syntax, including
joins across entries and additional qualifiers.
NoReturn : Removes trailing carriage returns and line feeds.
* Palindrome : Searches for perfect inverted repeats.
PepAllWindow : Plots hydrophobicity for one or more multiple
sequence alignments.
PepCoil : Identifies potential coiled-coil regions in proteins.
* PlotAlign : Plots conserved properties at each position in a multiple
sequence alignment.
* SeqDbToGcg : Converts the SEQDB database to GCG format.
ToEmbl : Extracts an EMBL entry in EMBL format.
ToGenBank : Extracts a GenBank entry in GenBank format.
ToPirAll : Converts a set of sequences or subsequences into a single
file in PIR format.
* ToText : Converts a sequence to plain text.
TProfileGap : ProfileGap with optional 6-frame translation of a
DNA sequence.
TProfileSearch : ProfileSearch with ability to search any size of
database, and optional 6-frame translation of DNA databases. [1,2]
TProfileSegments : Processes the output file from TProfileSearch. [1,2]
TSegments : Processes TWordSearch output.
TWordSearch : WordSearch with a 6-frame translation of the database.
(2) 19 GCG programs with command line control
Command line control has been added to all the GCG programs that did
not already support it fully. This work was done by summer student
Jaakko Hattula from Tampere University of Technology in Finland.
The programs (see the GCG manual for details) are:
EAssemble, ECodonFrequency, ECompTable, EConsensus, ECorrespond,
ECrypt, EDiverge, EExtractPeptide, EFingerPrint, EFromStaden,
EGetSeq, EPublish, ERepeat, EReverse, EStatPlot, ETerminator,
EToStaden, ETranslate, EWindow
(3) 20 programs from the original GCGEMBL package, many now enhanced:
Antigenic : Reports potential antigenic regions.
CheckLen : Calculates checksums for entries in a database.
CheckLenComp : Compares CheckLen output for two databases, and reports
a list of unique entries.
CpGPlot : Plots the frequency of occurrence of CG dinucleotides
and percentage of C and G in a sequence.
FastACheck : Selects significant alignments from (T)FastA output files.
GbOnly : Creates a list of GenBank entries that have accession numbers
not found in the latest EMBL release.
GelAnalyze : Reads the output of GelStatus, and produces project
statistics for shotgun sequencing.
GelPicture : Displays a diagram and printout of a contig from a
Fragment Assembly project, with ambiguities highlighted.
GelStatus : Reports progress of a Fragment Assembly project.
HelixTurnHelix : Predicts helix-turn-helix DNA binding domains.
NewQuickIndex : A much faster version of QuickIndex that produces the
index files for NewQuickSearch.
NewQuickSearch : A much faster version of QuickSearch that can run
on almost all systems without a major virtual memory overhead.
PepNet : Displays part of a protein as a helical net.
PepStats : Gives a short statistical summary on the composition
of a protein sequence, or a 3-frame translation.
PepWheel : Displays part of a protein as a helical wheel.
PepWindow : Plots hydrophobicity of a protein sequence.
PirOnly : Selects entries from PIR that are not in the latest
Swiss-Prot release.
PrettyPlot : Displays multiple sequence alignments with boxes around
conserved regions.
QuickMatch : Displays the overlaps found by NewQuickSearch (or by
QuickSearch), with selection for good quality matches.
SigCleave : Predicts signal peptide cleavage sites.
DISTRIBUTION
The programs are available from EMBL as follows:
(1) by anonymous FTP (binary mode) to ftp.embl-heidelberg.de
directory: /pub/software/vax/egcg
Files:
ecore.bck : command procedures and data
edoc.bck : documentation source
ehelp.bck : help files
esource.bck : full source code
000readme.txt : installation advice
fixrec.c : utility to make .bck files readable
fixrec.com : utility to make .bck files readable
whats_new.lis : release notes (empty initially)
EGCG is stored in 4 VMS backup savesets. The 000README.TXT file
explains how to fix the file format of these savesets.
Additional files:
fixrec.c C source to produce fixrec.exe which fixes
the record length of the save set if you have
any problems.
fixrec.com DCL script that does the same as the above.
(2) by E-mail from the EMBL Network File Server
Send E-mail to address NETSERV at EMBL-Heidelberg.DE with the
following message text:
HELP SOFTWARE
HELP VAX_SOFTWARE
GET VAX_SOFTWARE:EGCG.UAA
EGCG is provided in (at present) 46 separate files. They are unpacked
with UUDECODE and ZOO (the VAX_SOFTWARE file explains how) to give
the following files:
ecore.bck : command procedures and data
edoc.bck : documentation source
ehelp.bck : help files
esource.bck : full source code
000readme.txt : installation advice
fixrec.c : utility to make .bck files readable
fixrec.com : utility to make .bck files readable
whats_new.lis : release notes (empty initially)
EGCG is stored in 4 VMS backup savesets. The 000README.TXT file
explains how to fix the file format of these savesets.
Additional files:
file.exe utility to fix the record format of VMS BACKUP savesets
restored by ZOO
file.zoo ZOO archive of the original distribution of FILE
fixbck.com DCL procedure to run FILE
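As a rough illustration of option (1) above, the anonymous FTP retrieval
could be scripted along the following lines. This is only a sketch in
Python using the standard ftplib module. The host, directory and file
names are those listed in this announcement; the function names and the
local "egcg" output directory are assumptions made for the example, and
the server may not be reachable at this address.

```python
from ftplib import FTP
from pathlib import Path

HOST = "ftp.embl-heidelberg.de"        # host named in the announcement
REMOTE_DIR = "/pub/software/vax/egcg"  # directory named in the announcement
FILES = [
    "ecore.bck", "edoc.bck", "ehelp.bck", "esource.bck",
    "000readme.txt", "fixrec.c", "fixrec.com", "whats_new.lis",
]


def retr_command(name: str) -> str:
    """Build the FTP RETR command for one remote file (hypothetical helper)."""
    return f"RETR {name}"


def fetch_egcg(dest: str = "egcg") -> None:
    """Download every listed file in binary mode (hypothetical helper)."""
    out_dir = Path(dest)
    out_dir.mkdir(parents=True, exist_ok=True)
    with FTP(HOST) as ftp:
        ftp.login()          # no arguments means an anonymous login
        ftp.cwd(REMOTE_DIR)
        for name in FILES:
            with open(out_dir / name, "wb") as fh:
                # retrbinary transfers in binary mode, as the
                # announcement specifies for this directory
                ftp.retrbinary(retr_command(name), fh.write)
```

Calling fetch_egcg() would attempt the transfer. The savesets may still
need the record-format fix described in 000README.TXT before VMS BACKUP
can read them.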
REFERENCES
[1] Gibson, TJ, et al. (1993) TIBS 18:331-333
[2] Musacchio, A, et al. (1993) TIBS 18:343-348
ACKNOWLEDGEMENTS
Version 7.2 of the EGCG Programs was prepared by Peter Rice (EMBL,
Heidelberg, Germany), Rodrigo Lopez (Biotechnology Centre of Oslo,
Norway), Jaakko Hattula (Tampere University of Technology, Finland),
Reinhard Doelz (Basel, Switzerland) and Jack Leunissen (CAOS/CAMM Centre,
Netherlands).
We are very grateful to (in alphabetical order) Rein Aasland, Wilhelm
Ansorge, Peer Bork, Thure Etzold, Toby Gibson, Tom Kristensen, Franc
Pattus, Kate Rice, Christian Schwager, Peter Sibbald, Julie Thompson,
Hartmut Voss and Gert Vriend for their many contributions and critical
comments as users of the EGCG Programs.
We are also deeply indebted to the staff of GCG Inc. who provided rapid
and helpful answers to our many questions during the development of
the programs. Many thanks to Irv Edelman, Maggie Smith, Donald Katz, Michael
Hogan, Joseph King, Mary Schulz and especially John Devereux.
CONTACTS
Peter Rice Peter.Rice at EMBL-Heidelberg.de Tel: +49 6221-387247
Rodrigo Lopez rodrigol at biotek.uio.no Tel: +47 22958756
-----------------------------------------------------------------------------
Peter Rice, EMBL | Post: Computer Group
| European Molecular
Internet: Peter.Rice at EMBL-Heidelberg.DE | Biology Laboratory
| Postfach 10-2209
Phone: +49-6221-387247 | 69012 Heidelberg
Fax: +49-6221-387306 | Germany