Announcement of the Protein Identification Resource
BITNET Network Request Service
Five commands and access to two new sequence databases have been added to the
network request service. The new commands are FEATURE, HOST, QUIT, SUPERFAMILY
and TAXONOMY. They are described below in more detail. The two new databases
are NRL_3D which contains the sequence information extracted from the
Brookhaven Protein Data Bank, and GBNEW which contains the weekly update
sequences from GenBank (TM). These databases are automatically available
through all the commands that can use them.
The National Biomedical Research Foundation Protein Identification Resource has
a full-function network fileserver and database query system. This automatic
network server, operating since August 1990, is capable of handling database
queries, sequence searches and sequence submissions, in addition to fileserver
requests. To use this server, request commands should be directed to
FILESERV at GUNBRF on BITNET. The FILESERVer recognizes the following commands
sent either in a mail message, or (if the sender is on BITNET) in command
messages or in a file:
Command        Action
-------        -----------------------------------------------
ACCESSION      list entry codes and titles by accession number
AUTHOR         list entry codes and titles by author
BASES          list accessible databases
DEPOSIT        deposit entry for database submission
END DEPOSIT    terminate deposit entry
FEATURE        list entry codes and titles by feature table entry
GET            return entry by entry code
HELP           return HELP instructions
HOST           list entry codes and titles by host species
INDEX          list SENDable files
JOURNAL        list entry codes and titles by journal citation
KEYWORD        list entry codes and titles by keyword
QUIT           ignore the remaining text (E-mail signature blocks)
RETURN         change return address for gateway mail
SEARCH         search for sequence by FASTA procedure
END SEARCH     terminate sequence for searching
SEND           send file
SPECIES        list entry codes and titles by species
SUGGEST        leave suggestion or correction for PIR staff
END SUGGEST    terminate suggestion text
SUPERFAMILY    list entry codes and titles by superfamily name
TAXONOMY       report taxonomy for scientific or common name
TITLE          list entry codes and titles by title
Multiple commands can be sent with one command on each line of a mail message
or file. Commands should NOT be sent on the Subject line of a mail message.
Receipt of command messages and files will be acknowledged immediately. Mail
messages will be acknowledged by return mail.
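The one-command-per-line rule can be illustrated with a short Python sketch. The helper name build_request is hypothetical and is not part of the server; it only assembles the text that would be placed in the body of a mail message, and it handles single-line commands only.

```python
def build_request(commands):
    """Return a mail body with one server command on each line."""
    lines = []
    for command in commands:
        text = command.strip()
        if not text:
            continue  # ignore empty entries
        if "\n" in text:
            raise ValueError("one command per line: %r" % text)
        lines.append(text)
    return "\n".join(lines) + "\n"


# This text goes in the message body, never on the Subject line.
print(build_request(["HELP SEARCH", "BASES", "INDEX"]), end="")
```

Multi-line commands (DEPOSIT, SEARCH, SUGGEST) need their END lines and are not covered by this helper.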
For help in using any of the commands, send a request of the form
HELP topic
for example
HELP SEARCH
In addition to the commands, help instructions are also available on the
following topics:
Gateway_Access
IBM-VM_BITNET
On-Line_Access
VAX-VMS_BITNET
Because of inter-network gateway communication protocols, there are limitations
on requests sent through gateways. Users not on BITNET or INTERNET who will be
accessing BITNET through local or inter-network gateways should read and
carefully follow these instructions before sending requests. Only mail message
requests (not command messages or files) can be sent through gateways. Because
the addresses posted on gateway mail do not always work for the return, before
you send requests through inter-network gateways it is strongly recommended that
you first contact Dr. John S. Garavelli at POSTMASTER at GUNBRF on BITNET. We will
confirm a return address for you and may instruct you to use the RETURN command
to ensure that your request output will reach you. It is not usually necessary
to do this if you are on BITNET or INTERNET, unless your system employs a local
remailer or your mail program applies a non-standard return address (for
example a personal name on the FROM: line).
The BITNET network and the inter-network gateways impose strict file size
limits. Poorly posed database queries may result in output so extensive that
it cannot be returned by network mail. Therefore, an output limit of 1000
lines for each command and 3000 lines total for each request is imposed by the
PIR FILESERVer.
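As a rough illustration of how the two limits could interact, the following Python sketch caps each command at 1000 lines and the whole request at 3000 lines. How the FILESERVer actually truncates output is not stated here; the function cap_output and its truncation order are assumptions made only for illustration.

```python
PER_COMMAND_LIMIT = 1000   # lines allowed for each command
PER_REQUEST_LIMIT = 3000   # lines allowed for the whole request


def cap_output(outputs):
    """outputs: one list of lines per command. Returns the capped lists."""
    capped = []
    remaining = PER_REQUEST_LIMIT
    for lines in outputs:
        # Assumption: earlier commands are served first until the total is used up.
        allowed = min(PER_COMMAND_LIMIT, remaining)
        kept = lines[:allowed]
        capped.append(kept)
        remaining -= len(kept)
    return capped


# Four hypothetical commands that each produce 1500 lines.
print([len(part) for part in cap_output([["x"] * 1500] * 4)])
```

Under these assumptions, a request with several broad queries may return nothing for its later commands, which is one reason to keep queries specific.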
The DEPOSIT command must, and the SEARCH and SUGGEST commands may, be followed
by their respective END commands when text appears on intervening lines. The
DEPOSIT command requires, and the SEARCH command optionally uses, parameters
that appear on the same line as the command. Because of the complexity of
these commands, users should obtain and carefully read the help instructions on
these commands before attempting to use them.
Here is a brief synopsis of each server command.
ACCESSION number
This command will return a list of entry codes and titles for entries with
accession numbers matching the left portion of the accession number provided.
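Matching on the left portion means that a partial accession number selects every entry whose accession number begins with it. A minimal Python sketch follows; the function name and the sample accession numbers are hypothetical, and the server's own matching rules (for example, case handling) are not specified here.

```python
def match_left_portion(query, accession_numbers):
    """Return the accession numbers that begin with the query text."""
    return [number for number in accession_numbers if number.startswith(query)]


sample = ["A00001", "A00002", "B00001"]  # hypothetical accession numbers
print(match_left_portion("A0000", sample))
```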
AUTHOR name
This command will return a list of entry codes and titles for entries with an
author matching the portion of the author name provided.
BASES
This command will return a list of the accessible databases and the number
of entries each contains. Currently, this selection of databases cannot
be changed during network access. The databases available and their
abbreviations for code specification are as follows:
Abbreviation   Database                               Update Schedule
PIR1           PIR Annotated and Classified Entries   quarterly
PIR2           PIR Preliminary Entries                approximately bimonthly
PIR3           PIR Unverified Entries                 weekly
NRL_3D         Brookhaven Data Bank Sequences         quarterly
N              NBRF Nucleic
GB             GenBank (TM)                           as received
GBNEW          GenBank (TM) New Entries               weekly
EMBL           EMBL                                   as received
Access to these and additional databases can be provided to on-line users.
DEPOSIT FORM or DEPOSIT AUTHORIN
submission text
END DEPOSIT
This command will allow the submission of protein sequence entries prepared in
a standard format. The PIR accepts submissions in the electronic version of
the GenBank/EMBL/PIR Data Submission Form, or in the Transaction Protocol
Format of the GenBank AUTHORIN program. This command MUST be followed on the
same line by either FORM or AUTHORIN to indicate the type of deposit, and by
the END DEPOSIT command at the end of the text of the entry. Only one DEPOSIT
command should be sent with each request. A separate form must be submitted
for each sequence. Forms with more than one sequence and requests with more
than one DEPOSIT command cannot be accepted.
It is important that nucleotide sequences including authors' protein sequence
translations be submitted only to GenBank or EMBL, as appropriate, and not to
the PIR FILESERVer. GenBank and EMBL forward protein sequences to the PIR
International with no further effort required on the part of the author.
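The framing required for a deposit (the type on the DEPOSIT line, the entry text, then END DEPOSIT) can be sketched in Python as follows. The helper build_deposit is hypothetical; it does not check the content of the form itself, only the framing described above.

```python
DEPOSIT_TYPES = ("FORM", "AUTHORIN")


def build_deposit(deposit_type, entry_text):
    """Wrap one prepared entry in DEPOSIT ... END DEPOSIT lines."""
    if deposit_type not in DEPOSIT_TYPES:
        raise ValueError("DEPOSIT must be followed by FORM or AUTHORIN")
    body = entry_text.strip("\n")
    if not body:
        raise ValueError("the deposit text is empty")
    # Only one DEPOSIT command, holding one sequence, belongs in a request.
    return "DEPOSIT " + deposit_type + "\n" + body + "\nEND DEPOSIT\n"


print(build_deposit("FORM", "(text of the completed submission form)"), end="")
```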
FEATURE feature-name
This command will return a list of entry codes and titles for entries (in the
PIR databases only) with an entry in the feature table matching the portion of
the feature name provided. A list of the features currently in the database
can be obtained by the command SEND FEATURES.
GET code
This command will return the full text of an entry with the code matching the
code provided. These codes are found in the lists returned by one of the
search commands (ACCESSION, AUTHOR, JOURNAL, FEATURE, HOST, KEYWORD, SPECIES,
SUPERFAMILY or TITLE). The format of the code is a database abbreviation, a
colon, and four to ten alphanumeric characters.
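The code format can be checked before a GET request is sent. The Python sketch below assumes that the abbreviation must be one of those listed under BASES and that it is written exactly as listed; the function name and the example codes are hypothetical.

```python
import re

DATABASES = {"PIR1", "PIR2", "PIR3", "NRL_3D", "N", "GB", "GBNEW", "EMBL"}

# database abbreviation, a colon, then four to ten alphanumeric characters
CODE_RE = re.compile(r"^([A-Za-z0-9_]+):([A-Za-z0-9]{4,10})$")


def parse_entry_code(code):
    """Return (database, identifier), or None if the code is malformed."""
    match = CODE_RE.match(code.strip())
    if match is None:
        return None
    database, identifier = match.group(1), match.group(2)
    if database not in DATABASES:
        return None
    return database, identifier


print(parse_entry_code("PIR1:CCHU"))   # example code, used for illustration
print(parse_entry_code("PIR1:ABC"))    # identifier too short
```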
HOST host-name
This command will return a list of entry codes and titles for entries (in the
PIR databases only) with a host name matching the portion of the host name
provided.
INDEX
This command will return a list of the files that can currently be sent by the
PIR FILESERVER using the SEND command.
JOURNAL citation
This command will return a list of entry codes and titles for entries with a
journal citation matching the portion of the citation provided.
KEYWORD words
This command will return a list of entry codes and titles for entries with
any keyword, or portion of a keyword, matching the words provided. You may
provide any number of groups of three or more alphanumeric characters
expected in a single keyword entry.
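To see which parts of a query count as groups of three or more alphanumeric characters, the text can be split as in the Python sketch below. What the server does with shorter groups is not stated; this hypothetical helper simply leaves them out.

```python
import re


def usable_groups(text):
    """Return the runs of three or more alphanumeric characters in text."""
    return re.findall(r"[A-Za-z0-9]{3,}", text)


print(usable_groups("cytochrome c oxidase"))
```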
QUIT
If you use a mail program which automatically attaches a signature block to
every message, use this command to inform the server that all the following
lines should be ignored.
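The effect of QUIT on a message can be pictured with the Python sketch below, which drops the QUIT line and everything after it. The function is a hypothetical model of the behavior described above, not the server's code; treating QUIT without regard to case is an assumption.

```python
def strip_after_quit(message_lines):
    """Return the lines that precede the first QUIT command."""
    kept = []
    for line in message_lines:
        if line.strip().upper() == "QUIT":
            break  # the signature block and anything else is ignored
        kept.append(line)
    return kept


message = ["BASES", "INDEX", "QUIT", "--", "A. Researcher", "Some Institute"]
print(strip_after_quit(message))
```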
RETURN address
If you are sending mail from a non-BITNET network through a gateway, you may
need to provide a return address different from the one posted on the message
in order for your output to be sent to you correctly. The RETURN command
will allow you to correct your return address.
SEARCH parameters sequence
or
SEARCH parameters
sequence text
END SEARCH
This command will allow a sequence to be compared in a FASTA search
(see W.R. Pearson & D.J. Lipman, PNAS (1988) 85:2444-2448) with the PIR
databases. You may send either protein or nucleotide sequences in the IUPAC
standard single letter code; however, only the PIR protein sequence databases
will be searched. Nucleotide sequences will be translated in six reading
frames according to a selectable genetic code, and those translated protein
sequences will be compared against the PIR protein sequence databases. The
SEARCH command may be used in two forms, either on a single line with
parameters and sequence, or on multiple lines with the parameters on the line
with the SEARCH command, followed by lines with the sequence and an END SEARCH
command on the line following the end of the sequence. There are two optional
parameters for the SEARCH command, KTUP and NUC. The KTUP parameter sets the
ktup value for the FASTA program. The NUC parameter specifies that the sequence
is a nucleotide sequence, and can select the genetic code to be used for the
translation of that sequence.
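The multi-line form of SEARCH can be assembled as in the Python sketch below. The exact syntax of the KTUP and NUC parameters is given in HELP SEARCH and is not reproduced here, so the helper passes any parameter text through unchanged. The helper name, the 60-character line width and the example sequence are assumptions made for illustration.

```python
def build_search(sequence, parameters="", width=60):
    """Return a multi-line SEARCH request ending with END SEARCH."""
    residues = "".join(sequence.split())  # remove spaces and line breaks
    if not residues.isalpha():
        raise ValueError("sequence must contain only single-letter codes")
    lines = [("SEARCH " + parameters).strip()]
    for start in range(0, len(residues), width):
        lines.append(residues[start:start + width])
    lines.append("END SEARCH")
    return "\n".join(lines) + "\n"


# Hypothetical short peptide, wrapped at 10 characters for display.
print(build_search("MKTLLVAGLC SLLAQ", width=10), end="")
```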
SEND filename
This command will instruct the FILESERVer to send, by separate electronic
transmission, the specified file. A list of the currently available files
can be obtained by using the INDEX command.
SPECIES name
This command will return a list of entry codes and titles for entries with
the species matching the portion of a species name provided. The species
name may be the Latin genus and/or species name, or a common name. Because
the names of some viruses contain the common name of the host species, entry
codes and titles for entries with the species of viruses infecting a species
may also be listed.
Please note: this is not an efficient command for performing a general query
of the PIR databases, especially for extensively studied species. For
well-studied species, the TITLE command will be more efficient.
SUGGEST text
or
SUGGEST
text
END SUGGEST
This command will submit the text of your message to an NBRF staff member.
You may use it to suggest modifications or improvements to our FILESERVER
or corrections to the PIR database. You may either place the text on the
same line with the SUGGEST command, or you may use any number of lines for
the text followed by the END SUGGEST command on the line after the last line
of the text.
SUPERFAMILY superfamily-name
This command will return a list of entry codes and titles for entries (in the
PIR databases only) that belong to that superfamily. Since the domains of
some multidomain proteins are not completely classified, the SUPERFAMILY
command will not necessarily produce a complete list of all entries in a
specific superfamily. A list of the superfamilies currently in the database
can be obtained by the command SEND SUPERFAM.
TAXONOMY taxonomic-name
or
TAXONOMY common-name
This command will report taxonomies for entries in the taxonomic database
currently being used by the PIR and shared with GenBank (TM) and EMBL.
This database is maintained by Dr. Andrzej Elzanowski at the Max-Planck-
Institut fur Biochemie.
You should provide a fully or partially specified name of 1 to 8 words with
a minimum length of 3 letters each. Names at all taxonomic levels containing
those words will be reported. The organelles containing genetic material of
some higher organisms also have entries in this database.
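The stated limits on a TAXONOMY name (1 to 8 words, each with at least 3 letters) can be checked before sending, as in the Python sketch below. The function name is hypothetical, and counting only alphabetic characters toward the minimum is an assumption.

```python
def is_valid_taxonomy_query(name):
    """Check for 1 to 8 words, each containing at least 3 letters."""
    words = name.split()
    if not 1 <= len(words) <= 8:
        return False
    return all(sum(ch.isalpha() for ch in word) >= 3 for word in words)


print(is_valid_taxonomy_query("Homo sapiens"))
print(is_valid_taxonomy_query("E coli"))
```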
TITLE title
This command will return a list of entry codes and titles for entries with
any portion of a title matching the word provided. You may provide any
number of groups of three or more alphanumeric characters expected in a
single title. PIR titles include protein names, species names and Enzyme
Commission numbers; consequently, this command is generally the most efficient
way to perform a general query of the PIR databases.
------------------------------------------------------------------------
Dr. John S. Garavelli
Database Coordinator
Protein Identification Resource
National Biomedical Research Foundation
Washington, DC 20007
POSTMASTER at GUNBRF.BITNET