Karsten Quast in message <9211201109.AA11655 at net.bio.net> pleads
> I'm looking for a databank which contains protein secondary-structure data.
> I'd like to implement a neural network which predicts secondary-structure
> and need very much data for the training.
The PIR's NRL_3D database is an integrated database of protein sequence and
secondary structure information drawn from the entire Brookhaven Protein Data Bank.
The caveat is that the annotated secondary structure information is only what
appears in the Brookhaven Protein Data Bank. Since that information was
provided by the depositors, it is not necessarily complete (not all sequences
are annotated for all features, so the absence of a feature for a particular
sequence can't be taken as meaning that structure is not present) or consistent
(some depositors may interpret what is essentially the same structure in
different ways). The following is a description of the feature annotations in
the NRL_3D database, taken from one of our recent announcements. The current
version of NRL_3D is 10.00; it corresponds to Brookhaven Protein Data Bank
Release 61 and contains 1,457 sequences with 244,804 residues.
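For anyone preparing network training data, the caveat above suggests keeping
"not annotated" distinct from "no structure". The following is only a minimal
Python sketch of that idea, not part of NRL_3D or any PIR software; the names
label_residues, training_mask and UNKNOWN, and the one-letter codes H and E,
are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical marker for residues with no annotation. These are masked out
# of training rather than being treated as "coil" or "no structure".
UNKNOWN = "?"


def label_residues(length: int, features: List[Tuple[str, int, int]]) -> str:
    """Return one label per residue for a sequence of the given length.

    Each feature is (code, start, end) with 1-based, inclusive positions,
    for example ("H", 2, 5) for an annotated helix. Later features
    overwrite earlier ones where they overlap.
    """
    labels = [UNKNOWN] * length
    for code, start, end in features:
        # Clip the range to the sequence so bad annotations cannot overflow.
        for pos in range(max(start, 1), min(end, length) + 1):
            labels[pos - 1] = code
    return "".join(labels)


def training_mask(labels: str) -> List[bool]:
    """True where a residue carries an annotation and may be used for training."""
    return [label != UNKNOWN for label in labels]
```

With a 10-residue sequence annotated with a helix at 2-5 and a strand at 8-9,
label_residues(10, [("H", 2, 5), ("E", 8, 9)]) gives "?HHHH??EE?", and the
mask excludes the four unannotated positions. Whether to drop such positions
or whole sparsely annotated entries is a design choice left to the user.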
------------------------------------------------------------------------
2. NRL_3D Release 9.1 Has Feature Information from Brookhaven Data Bank
The NRL_3D Database of sequence information extracted from the Brookhaven
Protein Data Bank (PDB) has been upgraded to release 9.1. This new version
includes feature annotations extracted from PDB HELIX, SHEET, TURN, SITE, and
SSBOND records along with special ATOM and HETATM records. New algorithms
have been implemented to construct and name chains and fragments, to recognize
non-standard residues and to discard entries with completely unknown sequence.
NRL_3D release 9.1 corresponds to PDB release 60 (May 1992) and contains
1,380 sequences with 229,099 residues.
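To illustrate the kind of PDB records these annotations come from, here is a
rough Python sketch that reads residue ranges from HELIX and SHEET records.
It is not the algorithm used to build NRL_3D. The column positions are an
assumption based on the fixed-column layout of the published PDB file format
and should be checked against the release in use; the function name and the
codes "H" and "E" are hypothetical.

```python
from typing import Iterable, List, Tuple

# (code, chain identifier, first residue number, last residue number)
Feature = Tuple[str, str, int, int]


def parse_secondary_records(lines: Iterable[str]) -> List[Feature]:
    """Collect residue ranges from PDB HELIX and SHEET records.

    Assumed columns (1-based):
      HELIX: chain 20, initial residue 22-25, final residue 34-37
      SHEET: chain 22, initial residue 23-26, final residue 34-37
    All other record types are ignored.
    """
    features: List[Feature] = []
    for line in lines:
        if len(line) < 37:
            continue  # too short to hold both residue numbers
        record = line[:6]
        if record == "HELIX ":
            features.append(("H", line[19], int(line[21:25]), int(line[33:37])))
        elif record == "SHEET ":
            features.append(("E", line[21], int(line[22:26]), int(line[33:37])))
    return features
```

Called on the lines of a PDB entry, this returns tuples such as
("H", "A", 86, 94). Insertion codes and TURN, SITE and SSBOND records are
deliberately left out to keep the sketch short.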
The inclusion of this feature information in NRL_3D allows PDB entries to be
retrieved through the FEATURE command. For example, the commands
USE BASES NRL_3D
FEATURE TURN "TYPE I "
will list all entries in the NRL_3D database with a "type I" turn annotated
in their corresponding PDB entry.
Release 9.1 of NRL_3D is available through the PIR Network Request Server,
through the PIR On-Line Access System and by FTP from the University of Houston
server at ftp.bchs.uh.edu in the files
/pub/gene-server/incoming/pir33/nrl_3d-9.1-vms
/pub/gene-server/incoming/pir33/nrl_3d-9.1-ascii
Our thanks to Bill Pearson and Dan Davison for their efforts in providing FTP
access to the PIR databases.
------------------------------------------------------------------------
Dr. John S. Garavelli
Database Coordinator
Protein Information Resource
National Biomedical Research Foundation
Washington, DC 20007
POSTMASTER at GUNBRF.BITNET
POSTMASTER at NBRF.GEORGETOWN.EDU