Dear colleagues,
A new version of the program EMBL2NBRF is available from the CAOS/CAMM
anonymous FTP-site (host: camms1.caos.kun.nl; dir: pub/molbio/embl2nbrf).
EMBL2NBRF reformats EMBL or Swiss-Prot flatfiles into NBRF-styled files,
i.e. SEQ, REF, and (optionally) TTL files. These files can be used with
the GCG package (just run dbindex on them), or with the NBRF programs XQS,
PSQ, and NAQ.
The program compiles and runs on various flavors of UNIX and under VMS.
It requires only an ANSI-C compiler. If your system does not have one,
you can use the GCC compiler.
The major new feature of the program is its ability to read data from
standard input (in addition to the existing options to read flatfiles
directly, or via a file-list). It is therefore no longer necessary to copy,
uncompress, or concatenate files before they can be processed. See below
for some examples.
EXAMPLES:
_1_
To reformat the new compressed Swiss-Prot file, store the data in NBRF-files
'swissprot.seq' and 'swissprot.ref' (-n flag), store the summary-report in
'swissprot.info' (-s flag), and monitor the progress (-m flag), type:
% zcat sprot27.dat.Z | embl2nbrf -n swissprot -s swissprot.info -m --
_2_
To reformat the primates-section on the EMBL CDROM, type:
% tr -d '\015' < /cdrom/EMBL/PRI.DAT | embl2nbrf -n em_pr -s em_pr.info --
(The translate command 'tr' strips all <CR>'s, octal 015, from the flatfile
on the CDROM.)
_3_
The same, but now storing ALL EMBL flatfiles from the CDROM, AND the EMBL
updates, which are named in a file-list (-f flag) and read from their
distribution directory (-d flag), into one NBRF-style database called 'embl'
(the default name), converting all sequences to uppercase (-u flag):
% cat /cdrom/EMBL/*DAT | tr -d '\015' | embl2nbrf -u -m \
-- -d /data/embnet/dna/data -f updates.lis
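The <CR>-stripping step used in examples 2 and 3 can be checked on a small
sample before running it on a whole CDROM. The following is only a sketch:
the helper name 'strip_cr' and the two-line sample entry are made up for
illustration and are not part of EMBL2NBRF.

```shell
# Hypothetical helper: delete every carriage return (octal 015) from stdin.
strip_cr() {
    tr -d '\015'
}

# Made-up two-line sample with DOS-style <CR><LF> line endings.
sample='ID   X00001\r\n//\r\n'

# Count the <CR> characters before and after filtering.
cr_before=$(printf "$sample" | tr -cd '\015' | wc -c | tr -d ' ')
cr_after=$(printf "$sample" | strip_cr | tr -cd '\015' | wc -c | tr -d ' ')

# Keep the cleaned text for inspection.
cleaned=$(printf "$sample" | strip_cr)

echo "CRs before: $cr_before, after: $cr_after"
```

If the second count is zero, the same filter can be placed in front of
embl2nbrf exactly as in the examples above.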
AVAILABLE OPTIONS:
Program Version: 1.9 (17-Nov-1993)
Syntax: embl2nbrf [-flags] [files] [ [-flags] [files] ]
Flags:
-- Read from standard input.
-a Append data to existing NBRF-formatted files.
-d DIR Read files (in a file-list) from directory DIR.
-f FOF Read filenames from file-list FOF.
-m Monitor mode on (Report every 1000 entries processed).
-n NAME Specify filename for output (default is "embl").
-s SUMM Save summary-report to file SUMM.
-t Create title-line file (TTL).
-u Convert sequences to uppercase. Default is to leave case unchanged.
-v Verbose mode on (Report every entry processed).
--
+----------------------------+-----------------------------------+
Jack A.M. Leunissen, Ph.D. | CAOS/CAMM Center
Email: jackl at caos.kun.nl | University of Nijmegen
Tel. : +31 80 65 22 48 | Toernooiveld 1
Fax : +31 80 65 29 77 | 6525 ED Nijmegen, The Netherlands
+-------- CAOS/CAMM is the Dutch National Node in EMBnet --------+