In message <9208271604.AA10206 at genbank.bio.net> ODONNELL at ARCB.AFRC.AC.UK
(Cary O'Donnell) expressed concern about an apparent "duplicated use of the
sequence ID for two very similar (almost identical!!) sequences."
Several people brought to our attention a problem concerning duplicated entry
identification codes and accession numbers among the data sections PIR1, PIR2,
and PIR3 in the PIR-International Protein Sequence Database. We apologize for
this difficulty and have modified our procedures to ensure that this does
not recur in future releases. We thank those who brought this problem to our
attention and will greatly appreciate any further comments, corrections, or
recommendations concerning the database.
We would like to take this opportunity to restate the policy concerning entry
identification codes and accession numbers in order to clarify (we hope)
the situation.
The entry identification code (on the `header-line' in NBRF-format; on the
`ENTRY' record in CODATA format) is a unique code assigned to every entry in
PIR1, PIR2, and PIR3. The code should be unique across all three data
sections. The code is not a permanent identifier, however; it is subject to
change from release to release. The duplication of entry identification codes
reported in version 33 was a mistake and has been corrected.
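The uniqueness rule above can be checked mechanically. The following is
only a minimal sketch in Python, not a description of the procedure used by
PIR-International. The function name find_duplicate_codes and the codes
X00001 through X00004 are made up for illustration; the sketch assumes the
entry identification codes of each data section have already been
extracted into a list.

```python
from collections import defaultdict


def find_duplicate_codes(sections):
    """Return {code: [section, ...]} for every code that occurs more than once.

    `sections` maps a data section name to the list of entry
    identification codes found in it (hypothetical input layout).
    """
    seen = defaultdict(list)
    for section, codes in sections.items():
        for code in codes:
            seen[code].append(section)
    # A code is a problem if it was recorded more than once, whether in
    # two different sections or twice within the same one.
    return {code: where for code, where in seen.items() if len(where) > 1}


# Made-up codes, for illustration only.
sections = {
    "PIR1": ["X00001", "X00002"],
    "PIR2": ["X00003", "X00002"],
    "PIR3": ["X00004"],
}
duplicates = find_duplicate_codes(sections)
print(duplicates)
```

An empty result means every code is unique across PIR1, PIR2, and PIR3.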
An accession number as it appears within the reference section of an entry
refers uniquely to the sequence as reported by the authors in the corresponding
publication, manuscript, or submission. These sequences are being compiled
into an archival data set. The accession number is the entry identification
code of the archival sequence entry. These accession numbers are permanent
identifiers of the `reported' sequences and will remain associated with the
reported sequences as long as they remain in the database.
When the data are processed by PIR-International staff and entered in the PIR1
and PIR2 data sections, the accession numbers are placed in the accession field
of the appropriate reference. In NBRF format, they occur on `A;Accession:'
lines following the corresponding reference. In CODATA format they occur
within the `REFERENCE #accession' fields. Please refer to the document CXFSD
available from FILESERV at GUNBRF.BITNET (SEND CXFSD) for specifics concerning the
CODATA format.
Note that the reference-specific accession numbers are distinct from those that
occur on the `C;Accession:' line (NBRF format) or on the ACCESSION record
(CODATA format). This field contains a list of all the accession numbers that
were ever associated with the entry; some of these do not correspond to
specific reported sequences because our original policy was to associate them
with the entire `merged' PIR entry.
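To make the distinction concrete, here is a rough Python sketch that
separates the entry-level ACCESSION record from the reference-specific
#accession fields of a CODATA-format entry. It is an illustration only: the
function name parse_accessions is hypothetical, the sample text is an
excerpt of the JS0468 entry appended to this message, and the sketch
assumes the simple one-field-per-line layout shown there. It does not
handle continuation lines or other details covered in the CXFSD document.

```python
def parse_accessions(entry_text):
    """Return (entry_level, per_reference) accession numbers.

    entry_level   -- accession numbers listed on the ACCESSION record
    per_reference -- one list per REFERENCE record, holding the values of
                     that reference's #Accession field
    """
    entry_level = []
    per_reference = []
    for raw in entry_text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("ACCESSION"):
            # Values on the ACCESSION record are separated by backslashes.
            body = line[len("ACCESSION"):]
            entry_level = [a.strip() for a in body.split("\\") if a.strip()]
        elif line.startswith("REFERENCE"):
            per_reference.append([])
        elif line.startswith("#Accession") and per_reference:
            per_reference[-1].extend(line.split()[1:])
    return entry_level, per_reference


# Excerpt of the appended entry (most fields omitted).
sample = r"""
ENTRY JS0468 #Type Protein
ACCESSION JS0468\ A26616\ PX0015
REFERENCE
#Reference-number JS0468
#Accession JS0468
REFERENCE
#Reference-number A94154
#Accession A26616
REFERENCE
#Reference-number PX0016
#Accession PX0015
"""

entry_level, per_reference = parse_accessions(sample)
referenced = {acc for ref in per_reference for acc in ref}
# Accession numbers on the ACCESSION record with no matching reference.
unreferenced = set(entry_level) - referenced
print(entry_level, per_reference, unreferenced)
```

For this entry every number on the ACCESSION record is also attached to a
reference, so `unreferenced` is empty. In an entry carrying accession
numbers assigned under the original policy, the leftover numbers would be
the ones that refer to the merged entry as a whole.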
The PIR3 section of the database consists of all entries in the archival data
set that have not been entered into PIR1 and PIR2. These entries have not been
`merged' and the entry identification code and the accession number are
identical. There should not be any case where accession numbers found in PIR1
and PIR2 overlap with those in PIR3. However, there may be overlap among
accession numbers found within the PIR1 and PIR2 sections.
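The overlap rule for PIR3 can be expressed as a small set comparison. This
is again only a sketch with made-up accession numbers (X00001 through
X00004) and a hypothetical function name, pir3_overlap. It assumes the
accession numbers of each section have already been collected.

```python
def pir3_overlap(accessions_by_section):
    """Return accession numbers that occur in PIR3 and also in PIR1 or PIR2.

    Overlap between PIR1 and PIR2 is permitted, so the two are pooled
    before being compared with PIR3.
    """
    merged = set(accessions_by_section["PIR1"]) | set(accessions_by_section["PIR2"])
    return sorted(merged & set(accessions_by_section["PIR3"]))


# Made-up accession numbers, for illustration only.
consistent = {
    "PIR1": ["X00001", "X00002"],
    "PIR2": ["X00002", "X00003"],  # X00002 shared with PIR1: allowed
    "PIR3": ["X00004"],
}
inconsistent = dict(consistent, PIR3=["X00003", "X00004"])

ok_result = pir3_overlap(consistent)
bad_result = pir3_overlap(inconsistent)
print(ok_result, bad_result)
```

A non-empty result would point to an archival entry that has been entered
into PIR1 or PIR2 but was not removed from PIR3.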
The entry in question, A26616, will be merged with JS0468 in the next release.
A copy of the current version of that merged entry, which clearly presents the
origin of the sequence difference, is appended below.
------------------------------------------------------------------------
Dr. David G. George
Dr. John S. Garavelli
Protein Identification Resource
National Biomedical Research Foundation
Washington, DC 20007
POSTMASTER at GUNBRF.BITNET
------------------------------------------------------------------------
\\\
ENTRY JS0468 #Type Protein
TITLE Cytochrome-b5 reductase, placental - Human
#EC-number 1.6.2.2
DATE 17-Jul-1992 #Sequence 17-Jul-1992 #Text 17-Jul-1992
PLACEMENT 0.0 0.0 0.0 0.0 0.0
SOURCE Homo sapiens #Common-name man
ACCESSION JS0468\ A26616\ PX0015
REFERENCE
#Authors Tomatsu S., Kobayashi Y., Fukumaki Y., Yubisui T.,
Orii T., Sakaki Y.
#Journal Gene (1989) 80:353-361
#Title The organization and the complete nucleotide
sequence of the human NADH-cytochrome b5 reductase
gene.
#Reference-number JS0468
#Accession JS0468
#Molecule-type DNA
#Residues 1-301 <TOM>
#Cross-reference GB:M28705
#Comment The authors translated the codon CCA for residue 66
as Ser.
REFERENCE
#Authors Yubisui T., Naitoh Y., Zenno S., Tamura M.,
Takeshita M., Sakaki Y.
#Journal Proc. Natl. Acad. Sci. U.S.A. (1987) 84:3609-3613
#Title Molecular cloning of cDNAs of human liver and
placenta NADH-cytochrome b-5 reductase.
#Reference-number A94154
#Accession A26616
#Molecule-type mRNA
#Residues 8-65,'S',67-240 <YUB>
REFERENCE
#Authors Murakami K., Yubisui T., Takeshita M., Miyata T.
#Journal J. Biochem. (1989) 105:312-317
#Title The NH2-terminal structures of human and rat liver
microsomal NADH-cytochrome b5 reductases.
#Reference-number PX0016
#Accession PX0015
#Molecule-type protein
#Residues 2-25 <MUR>
KEYWORDS oxidoreductase
SUMMARY #Molecular-weight 34245 #Length 301 #Checksum 370
SEQUENCE
5 10 15 20 25 30
1 M G A Q L S T L G H M V L F P V W F L Y S L L M K L F Q R S
31 T P A I T L E S P D I K Y P L R L I D R E I I S H D T R R F
61 R F A L P P P Q H I L G L P V G Q H I Y L S A R I D G N L V
91 V R P Y T P I S S D D D K G F V D L V I K V Y F K D T H P K
121 F P A G G K M S Q Y L E S M Q I G D T I E F R G P S G L L V
151 Y Q G K G K F A I R P D K K S N P I I R T V K S V G M I A G
181 G T G I T P M L Q V I R A I M K D P D D H T V C H L L F A N
211 Q T E K D I L L R P E L E E L R N K H S A R F K L W Y T L D
241 R A P E A W D Y G Q G F V N E E M I R D H L P P P E E E P L
271 V L M C G P P P M I Q Y A C L P N L D H V G H P T E R C F V
301 F
///
\\\