PIR Network Request Service

POSTMAST at GUNBRF.BITNET
Wed Jul 1 12:53:00 EST 1992


Distribution-File:
        BIONEWS at genbank.bio.net,
        PROTEINS at genbank.bio.net

            Announcement of the Protein Identification Resource
                        Network Request Service

Two commands and access to one new database have been added to the PIR network
request service.  The new commands, CROSS and GENE, are described below in more
detail.  For reasons of program access, the GenBank database is broken into
three components, GB, GBSUP and GBNEW.  The GBNEW database contains the
GenBank weekly update entries.  GBSUP contains regular GenBank entries in
a supplemental database (presently these are the GenBank Primate entries)
and GB contains all the other regular GenBank entries.  All these GenBank
databases are automatically available through all the commands that can use
them.  Particular databases may be selected with the USE BASES command, and
the command
  USE BASES GB*
will select all the GenBank databases, and only those databases, for
subsequent database query and retrieval commands.

The National Biomedical Research Foundation Protein Identification Resource
network request service is a full-function fileserver and database query
system.  It has been operating since August 1990 and is capable of handling
database queries, sequence searches and sequence submissions, in addition to
fileserver requests.  To use this server, request commands should be sent to
FILESERV at GUNBRF on BITNET.  The FILESERVer recognizes the following commands
sent either in a mail message, or (if the sender is on BITNET) in command
messages or in a file:

  Command        Action
  -------        -----------------------------------------------
  ACCESSION      list entry codes and titles by accession number
  AND            combine QUERY commands with Boolean AND
  AUTHOR         list entry codes and titles by author
  BASES          list accessible databases
  CROSS          list PIR entry codes and titles corresponding to
                 a particular nucleotide sequence database entry
  DEPOSIT        deposit entry for database submission
    END DEPOSIT  terminate deposit entry
  FEATURE        list entry codes and titles by feature table entry
  GENE           list entry codes and titles for a gene name
  GET            return entry by entry code
  HELP           return HELP instructions
  HOST           list entry codes and titles by host species
  INDEX          list SENDable files
  JOURNAL        list entry codes and titles by journal citation
  KEYWORD        list entry codes and titles by keyword
  MEMBER         list alignments containing entry code as a member
  NOT            combine QUERY commands with Boolean NOT
  OR             combine QUERY commands with Boolean OR
  QUERY          begin collecting QUERY commands
    END QUERY    terminate collecting commands and execute QUERY
  QUIT           ignore the remaining text (E-mail signature blocks)
  RETURN         change return address for gateway mail
  SEARCH         search for sequence by FASTA procedure
    END SEARCH   terminate sequence for searching
  SEND           send file
  SPECIES        list entry codes and titles by species
  SUGGEST        leave suggestion or correction for PIR staff
    END SUGGEST  terminate suggestion text
  SUPERFAMILY    list entry codes and titles by superfamily name
  TAXONOMY       report taxonomy for scientific or common name
  TITLE          list entry codes and titles by title
  USE            set databases or dates to use in limited searches

Multiple commands can be sent with one command on each line of a mail message
or file.  Commands should NOT be sent on the Subject line of a mail message.
Receipt of command messages and files will be acknowledged immediately.  Mail
messages will be acknowledged by return mail.
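As an illustration of the one-command-per-line rule, the following Python sketch assembles the body of a request message.  The function name build_request and the example commands are hypothetical; only the layout (one command per line, nothing on the Subject line) comes from the description above.

```python
def build_request(commands):
    """Return a mail body carrying one server command per line.

    Blank entries are dropped and surrounding whitespace is trimmed.
    The commands belong in the body, never on the Subject line.
    """
    lines = [command.strip() for command in commands if command.strip()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical request: ask for help on SEARCH and list the SENDable files.
    print(build_request(["HELP SEARCH", "INDEX"]), end="")
```

The resulting text would be mailed to FILESERV at GUNBRF with the Subject line left empty or unrelated to the request.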

For help in using any of the commands, send a request of the form
  HELP topic
for example
  HELP SEARCH

In addition to the commands, help instructions are also available on the
following topics:
  Custom_Services
  Databases
  Gateway_Access
  Help_en_Espanol
  Help_en_francais
  IBM-VM_BITNET
  On-Line_Access
  PIR_Distribution
  VAX-VMS_BITNET

Because of network gateway communication protocols, there are limitations
on requests sent through gateways.  Users not on BITNET or INTERNET who
access BITNET through local or network gateways should read and carefully
follow these instructions before sending requests.  Only mail message
requests (not command messages or files) can be sent through gateways.
Because addresses posted on gateway mail do not always work for the return,
before you send requests through network gateways it is strongly recommended
that you first contact Dr. John S. Garavelli at POSTMASTER at GUNBRF on BITNET.
We will confirm a return address for you and may instruct you to use the
RETURN command to ensure that your request output will reach you.  It is not
usually necessary to do this if you are on BITNET or INTERNET, unless your
system employs a local remailer or your mail program applies a non-standard
return address (for example a personal name on the FROM: line).

The BITNET network and the network gateways impose strict limits on file size.
Poorly posed database queries may result in output so extensive that it could
not be returned by network mail.  Therefore, an output limit of 1000 lines for
each command and 3000 lines for each request is imposed by the PIR FILESERVer.
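The effect of the two limits can be sketched in Python as follows.  How the FILESERVer actually truncates output is not described here, so the policy shown (keep the first lines of each command until either limit is reached) is an assumption, and the function name apply_limits is hypothetical.

```python
PER_COMMAND_LIMIT = 1000   # lines allowed for each command (from the text)
PER_REQUEST_LIMIT = 3000   # lines allowed for each request (from the text)


def apply_limits(outputs):
    """Truncate per-command outputs to the stated limits.

    `outputs` is a list of lists of lines, one inner list per command.
    Assumed policy: keep leading lines, stop when the request total is spent.
    """
    remaining = PER_REQUEST_LIMIT
    limited = []
    for lines in outputs:
        kept = lines[:min(PER_COMMAND_LIMIT, remaining)]
        remaining -= len(kept)
        limited.append(kept)
    return limited
```

Under this assumed policy, a request with four commands that each produce 1200 lines would return 1000 lines for each of the first three commands and nothing for the fourth.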

The DEPOSIT and QUERY commands must, and the SEARCH and SUGGEST commands may,
be followed by their respective END commands when text appears on intervening
lines.  The DEPOSIT command requires, and the SEARCH command optionally uses,
parameters that appear on the same line as the command.  Because these four
commands are so complex, users should obtain and carefully read the help
instructions before attempting to use them.
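A request can be checked for missing END lines before it is mailed.  The Python sketch below is only a client-side aid under stated assumptions: it treats command names as case-insensitive and ignores every line inside an open block except the matching END line.  The function name unterminated_blocks is hypothetical.

```python
REQUIRES_END = {"DEPOSIT", "QUERY"}   # these commands MUST be closed by END


def unterminated_blocks(lines):
    """Return (command, line number) pairs for blocks that are never closed."""
    open_block = None
    for number, line in enumerate(lines, start=1):
        words = line.split()
        if not words:
            continue
        first = words[0].upper()
        if open_block is not None:
            # Inside a block only the matching END line is significant.
            if first == "END" and len(words) > 1 and words[1].upper() == open_block[0]:
                open_block = None
            continue
        if first in REQUIRES_END:
            open_block = (first, number)
    return [open_block] if open_block is not None else []
```

An empty list means every DEPOSIT and QUERY block in the request has its END line.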

Here is a brief synopsis of each server command.

  ACCESSION number
This command will return a list of entry codes and titles for entries with
accession numbers matching the left portion of the accession number provided.

  AND
This command performs a Boolean AND operation in a QUERY, using the set of
entries collected by the preceding commands and selecting those that
additionally meet the condition specified in the next command.

  AUTHOR name
This command will return a list of entry codes and titles for entries with an
author matching the portion of the author name provided.

  BASES
This command will return a list of the accessible databases and the number of
entries each contains.  The databases available through the PIR Network Server
and their abbreviations for code specification are as follows:
  Abbreviation  Database                              Update Schedule
  PIR1          PIR Annotated and Classified Entries  quarterly
  PIR2          PIR Preliminary Entries               approximately bimonthly
  PIR3          PIR Unverified Entries                weekly
  ALN           PIR Alignment Entries                 quarterly
  NRL_3D        Brookhaven Data Bank Sequences        quarterly
  N             NBRF Nucleic
  GB            GenBank (TM)                          as received
  GBSUP         GenBank (TM)                          as received
  GBNEW         GenBank (TM) New Entries              weekly
  EMBL          EMBL                                  as received
Access to these and additional databases can be provided to on-line users.

  CROSS number
Use this command to find PIR entries that are the translation products of
nucleotide sequence database entries.  This command will return a list of
entry codes and titles for entries in the PIR databases only with a
cross-reference to the accession number provided in one of the nucleotide
sequence databases.

  DEPOSIT FORM or DEPOSIT AUTHORIN
    submission text
  END DEPOSIT
This command will allow the submission of protein sequence entries prepared in
a standard format.  The PIR accepts submissions in the electronic version of
the GenBank/EMBL/PIR Data Submission Form, or in the Transaction Protocol
Format of the GenBank AUTHORIN program.  This command MUST be followed on the
same line by either FORM or AUTHORIN to indicate the type of deposit, and by
the END DEPOSIT command at the end of the text of the entry.  Only one DEPOSIT
command should be sent with each request.  A separate form must be submitted
for each sequence.  Forms with more than one sequence and requests with more
than one DEPOSIT command cannot be accepted.
It is important that nucleotide sequences including authors' protein sequence
translations be submitted only to GenBank or EMBL, as appropriate, and not
to the PIR FILESERVer.  GenBank and EMBL forward protein sequences to the PIR
International with no further effort required on the part of the author.

  FEATURE feature-name
This command will return a list of entry codes and titles for entries in the
PIR databases only with an entry in the feature table matching the portion of
the feature name provided.  A list of the features currently in the database
can be obtained by the command SEND FEATURES.

  GENE gene-name
This command will return a list of entry codes and titles for entries in the
PIR databases only with an entry in the gene name field matching a portion of
the gene name provided.  At least three characters must be provided, and case
is ignored.  A gene name shorter than three characters can be supplied by
padding it with spaces to three characters and enclosing it in quotation marks.
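The three-character rule for GENE can be illustrated with a short Python sketch.  The text does not say where the padding spaces go, so padding on the right is an assumption; the function name gene_command and the example gene names are hypothetical.

```python
def gene_command(name):
    """Format a GENE command line for the given gene name.

    Names shorter than three characters are padded with trailing spaces
    (an assumed placement) and enclosed in quotation marks.
    """
    name = name.strip()
    if not name:
        raise ValueError("a gene name is required")
    if len(name) < 3:
        return 'GENE "{}"'.format(name.ljust(3))
    return "GENE " + name
```

For example, gene_command("recA") gives the line GENE recA, while a two-letter name is quoted and padded to three characters.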

  GET code
This command will return the full text of an entry with the code matching
the code provided.  These codes are found in the lists returned by one of
the query commands (ACCESSION, AUTHOR, JOURNAL, FEATURE, HOST, KEYWORD,
SPECIES, SUPERFAMILY or TITLE) or the MEMBER command.  The format of the
code is a database abbreviation, a colon, and four to ten alphanumeric
characters.  Inside a QUERY, a GET ALL command can be used to return all
the entries selected as a result of the query commands.
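The entry code format described above can be checked with a regular expression.  The Python sketch below is a client-side check only; the list of abbreviations is taken from the BASES table, while the function name is_entry_code, the case-insensitive comparison and the example codes are assumptions.

```python
import re

# Database abbreviations listed under the BASES command.
DATABASES = {"PIR1", "PIR2", "PIR3", "ALN", "NRL_3D",
             "N", "GB", "GBSUP", "GBNEW", "EMBL"}

# Abbreviation, a colon, then four to ten alphanumeric characters.
CODE_PATTERN = re.compile(r"([A-Za-z0-9_]+):([A-Za-z0-9]{4,10})")


def is_entry_code(code):
    """Return True if `code` looks like 'database:entry' as described."""
    match = CODE_PATTERN.fullmatch(code)
    return match is not None and match.group(1).upper() in DATABASES
```

A code such as PIR1:ABCD passes this check, whereas PIR1:ABC (too short) or XYZ:ABCD (unknown database) does not.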

  HOST host-name
This command will return a list of entry codes and titles for entries in the
PIR databases only with a host name matching the portion of the host name
provided.

  INDEX
This command will return a list of the files that can currently be sent by the
PIR FILESERVER using the SEND command.

  JOURNAL citation
This command will return a list of entry codes and titles for entries with a
journal citation matching the portion of the citation provided.

  KEYWORD words
This command will return a list of entry codes and titles for entries with
any keyword, or portion of a keyword, matching the words provided.  You may
provide any number of groups of three or more alphanumeric characters
expected in a single keyword entry.

  MEMBER code
The MEMBER command searches the Alignment Database for any alignments that
contain the sequence entries with the corresponding code.

  NOT
This command performs a Boolean NOT operation in a QUERY, using the set of
entries collected by the preceding commands and removing from them the entries
that meet the condition specified in the next command.

  OR
This command performs a Boolean OR operation in a QUERY, using the set of
entries collected by the preceding commands and adding to them the set of
entries that meet the condition specified in the next command.

  QUERY
    commands
  END QUERY
This is a multi-line command to search for database entries that meet several
criteria simultaneously.  The commands between the QUERY command and the
END QUERY command are combined with Boolean operators to form a single database
query.  Any of the following commands can be used to form a query:  ACCESSION,
AUTHOR, FEATURE, HOST, JOURNAL, KEYWORD, SPECIES, SUPERFAMILY, TITLE.
Each command selects a set of entries from the available databases, then one of
the Boolean operations, AND, OR, NOT, is used to combine that set of entries
with the set selected by the next query command.  The USE command can be used
to limit the databases to be searched and the dates of the entries.
CAUTION --- poorly posed and inappropriately formed queries can easily select
a very large number of entries.  Please carefully read the help instructions
for this command before attempting to use it.
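The way the Boolean commands combine sets of entries can be sketched with Python sets.  Strict left-to-right evaluation is an assumption drawn from the descriptions of AND, OR and NOT above; the function name run_query and the entry codes in the example are hypothetical.

```python
def run_query(steps):
    """Combine entry sets left to right.

    `steps` alternates sets and operators, for example
    [set_a, "AND", set_b, "NOT", set_c].
    """
    result = set(steps[0])
    for operator, operand in zip(steps[1::2], steps[2::2]):
        if operator == "AND":
            result &= operand      # keep entries also in the next set
        elif operator == "OR":
            result |= operand      # add entries from the next set
        elif operator == "NOT":
            result -= operand      # remove entries found in the next set
        else:
            raise ValueError("unknown operator: " + operator)
    return result


if __name__ == "__main__":
    # Hypothetical result sets of three query commands.
    by_title = {"E1", "E2", "E3"}
    by_species = {"E2", "E3", "E4"}
    by_keyword = {"E3"}
    print(sorted(run_query([by_title, "AND", by_species, "NOT", by_keyword])))
```

Narrowing with AND early in a query keeps the intermediate sets small, which is the point of the caution above.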

  QUIT
If you use a mail program which automatically attaches a signature block to
every message, use this command to inform the server that all the following
lines should be ignored.
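The effect of QUIT on a message with a signature block can be sketched as follows.  This Python fragment is only a model of the behaviour described, assuming the command is recognized regardless of case; the function name commands_before_quit is hypothetical.

```python
def commands_before_quit(body):
    """Return the non-blank lines of `body` that precede a QUIT line."""
    kept = []
    for line in body.splitlines():
        words = line.split()
        if not words:
            continue
        if words[0].upper() == "QUIT":
            break                  # everything after QUIT is ignored
        kept.append(line.strip())
    return kept
```

With a body of HELP SEARCH, INDEX and QUIT followed by a signature, only the first two lines are treated as commands.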

  RETURN address
If you are sending mail from a non-BITNET network through a gateway, you may
need to provide a return address different from the one posted on the message
in order for your output to be sent to you correctly.  The RETURN command
will allow you to correct your return address.

  SEARCH parameters sequence
    or
  SEARCH parameters
    sequence text
  END SEARCH
This command will allow a sequence to be compared in a FASTA search
(see W.R.Pearson & D.J. Lipman PNAS (1988) 85:2444-2448) with the PIR
databases.  You may send either protein or nucleotide sequences in the IUPAC
standard single letter code; however, only the PIR protein sequence databases
will be searched.  Nucleotide sequences will be translated in six reading
frames according to a selectable genetic code, and those translated protein
sequences will be compared against the PIR protein sequence databases.  The
SEARCH command may be used in two forms, either on a single line with
parameters and sequence, or on multiple lines with the parameters on the line
with the SEARCH command, followed by lines with the sequence and an END SEARCH
command on the line following the end of the sequence.  There are two optional
parameters for the SEARCH command, KTUP and NUC.  The KTUP parameter sets the
ktup value for the FASTA program.  The NUC parameter specifies that the sequence
is a nucleotide sequence, and can select the genetic code to be used for the
translation of that sequence.
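Translation in six reading frames can be sketched in Python as shown below.  This is not the server's implementation: it uses only the standard genetic code (the selectable codes are not reproduced), marks any codon it does not recognize as X, and the names translate and six_frames are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest).
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): amino
               for codon, amino in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(sequence):
    """Translate complete codons of `sequence`; unknown codons become X."""
    return "".join(CODON_TABLE.get(sequence[i:i + 3], "X")
                   for i in range(0, len(sequence) - 2, 3))


def six_frames(sequence):
    """Return three forward and three reverse-complement translations."""
    forward = sequence.upper().replace("U", "T")
    reverse = forward.translate(COMPLEMENT)[::-1]
    return [translate(strand[offset:])
            for strand in (forward, reverse)
            for offset in range(3)]
```

Each of the six protein sequences produced this way would then be compared against the protein databases, which is why a nucleotide query costs more than a protein query of the same length.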

  SEND filename
This command will instruct the FILESERVer to send, by separate electronic
transmission, the specified file.  A list of the currently available files
can be obtained by using the INDEX command.

  SPECIES name
This command will return a list of entry codes and titles for entries with
the species matching the portion of a species name provided.  The species
name may be the Latin genus and/or species name, or a common name.  Because
the names of some viruses contain the common name of the host species, entry
codes and titles for entries with the species of viruses infecting a species
may also be listed.
Please note: this is not an efficient command for performing a general query
of the PIR databases especially with extensively studied species.  For well-
studied species, the TITLE command will be more efficient.

  SUGGEST text
     or
  SUGGEST
    text
  END SUGGEST
This command will submit the text of your message to an NBRF staff member.
You may use it to suggest modifications or improvements to our FILESERVER
or corrections to the PIR database.  You may either place the text on the
same line with the SUGGEST command, or you may use any number of lines for
the text followed by the END SUGGEST command on the line after the last line
of the text.

  SUPERFAMILY superfamily-name
This command will return a list of entry codes and titles for entries in the
PIR databases only which belong to that superfamily.  Since the domains of
some multidomain proteins are not completely classified, the SUPERFAMILY
command will not necessarily produce a complete list of all entries in a
specific superfamily.  A list of the superfamilies currently in the database
can be obtained by the command SEND SUPERFAM.

  TAXONOMY taxonomic-name
    or
  TAXONOMY common-name
This command will report taxonomies for entries in the taxonomic database
currently being used by the PIR and shared with GenBank (TM) and EMBL.
This database is maintained by Dr. Andrzej Elzanowski at the Max-Planck-
Institut fur Biochemie.
You should provide a fully or partially specified name of 1 to 8 words with
a minimum length of 3 letters each.  Names at all taxonomic levels containing
those words will be reported.  The organelles containing genetic material of
some higher organisms also have entries in this database.

  TITLE title
This command will return a list of entry codes and titles for entries with
any portion of a title matching the word provided.  You may provide any
number of groups of three or more alphanumeric characters expected in a
single title.  PIR titles include protein names, species names and Enzyme
Commission numbers; consequently, this command is generally the most efficient
way to perform a general query of the PIR databases.

  USE
The USE command is used to select particular databases or dates to be used in
limited searches.  Three parameters may be set, the BEFORE date, the AFTER
date and the BASES database list.  The corresponding commands are
     USE BEFORE date
     USE AFTER  date
     USE BASES  database [ + database...]
Dates must be in the form, YYMMDD, where YY represents the last two digits
of the year, MM represents two digits of the month (with a leading zero, if
necessary), and DD represents the two digits of the day of the month (with a
leading zero, if necessary).
For the USE BASES command the set of databases to be used must be entered
as abbreviations on a single line connected by plus signs, "+".
All PIR databases can be used with "PIR*", all GenBank databases can be used
with "GB*", and all databases can be used with "*".
Not all commands work with all databases; please read the information returned
by the command HELP DATABASES.
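The YYMMDD date form and the plus-sign list for USE BASES can be produced with a few lines of Python.  The function name use_commands is hypothetical, and joining abbreviations with "+" and no surrounding spaces is an assumption, since the text does not say whether spaces are accepted.

```python
from datetime import date


def use_commands(before=None, after=None, bases=None):
    """Build USE command lines from optional dates and database names."""
    lines = []
    if before is not None:
        lines.append("USE BEFORE " + before.strftime("%y%m%d"))
    if after is not None:
        lines.append("USE AFTER " + after.strftime("%y%m%d"))
    if bases:
        # Abbreviations on a single line, connected by plus signs.
        lines.append("USE BASES " + "+".join(bases))
    return lines


if __name__ == "__main__":
    for line in use_commands(after=date(1992, 1, 5), bases=["PIR1", "GB*"]):
        print(line)
```

The strftime codes %y, %m and %d supply the two-digit year, month and day with leading zeros, matching the form required above.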

------------------------------------------------------------------------
                                 Dr. John S. Garavelli
                                 Database Coordinator
                                 Protein Identification Resource
                                 National Biomedical Research Foundation
                                 Washington, DC  20007
                                 POSTMASTER at GUNBRF.BITNET
