Containing 800,917 non-redundant protein sequences
The Protein Information Resource (PIR) is pleased to announce the
beta-release of the PIR-NREF (Non-redundant REFerence) Protein Database
at:
The PIR-NREF is designed to provide a timely and comprehensive
collection of all protein sequence data, keeping pace with the genome
sequencing projects while providing source attribution and minimal
redundancy. The database contains all sequences from Swiss-Prot, TrEMBL,
RefSeq, GenPept, and PDB, and is updated biweekly.
Non-redundancy is achieved by clustering on sequence identity and
taxonomy at the species level. The NREF report provides source
attribution with protein IDs and names from the source databases, in
addition to protein sequence, taxonomy, and bibliography.
The web site supports direct retrieval of NREF reports based on sequence
identifiers, as well as full-scale BLAST search and peptide/pattern
match for functional identification of query proteins or peptides. The
results are linked to the source databases for retrieval of up-to-date
source entries.
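
To illustrate the idea of clustering by sequence identity and species,
here is a minimal Python sketch. It is not the PIR pipeline: it assumes
that "sequence identity" means exact string identity (the announcement
does not specify the criterion), and the record fields, IDs, and names
below are hypothetical placeholders invented for the example.

```python
from collections import defaultdict


def build_nonredundant(records):
    """Collapse records that share the same sequence AND species.

    Each record is a dict with hypothetical keys:
    'db', 'id', 'name', 'species', 'sequence'.
    Assumption: identity means exact match after normalizing case and
    whitespace; the real NREF criterion may differ.
    """
    clusters = defaultdict(list)
    for rec in records:
        # Normalize so trivially different copies of a sequence match.
        seq = "".join(rec["sequence"].split()).upper()
        clusters[(seq, rec["species"])].append(rec)

    entries = []
    for (seq, species), members in clusters.items():
        entries.append({
            "sequence": seq,
            "species": species,
            # Source attribution: every database ID and name merged here.
            "sources": [(m["db"], m["id"], m["name"]) for m in members],
        })
    return entries


# Made-up records for illustration only (not real database entries).
records = [
    {"db": "Swiss-Prot", "id": "X0001", "name": "example kinase",
     "species": "Homo sapiens", "sequence": "MKTAYIAK"},
    {"db": "TrEMBL", "id": "Y0002", "name": "kinase-like protein",
     "species": "Homo sapiens", "sequence": "mktayiak"},
    {"db": "GenPept", "id": "Z0003", "name": "example kinase",
     "species": "Mus musculus", "sequence": "MKTAYIAK"},
]

entries = build_nonredundant(records)
for entry in entries:
    print(entry["species"], entry["sequence"], entry["sources"])
```

In this sketch the two human records collapse into one entry that keeps
both source IDs, while the mouse record stays separate because the
species differs even though the sequence is the same.
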
An example NREF entry is at:
An example BLAST search output is at:
The database is downloadable in XML format (data file) and FASTA format
from our FTP site at:
Please visit the pages and give us your feedback!
This work is supported in part by NIH grant P41 LM05798.
Please contact Cathy Wu at wuc at nbrf.georgetown.edu with any comments and
for inquiries regarding setting up reciprocal links or mirror sites.