Release 20 of TrEMBL, a protein sequence database supplementing SWISS-PROT

Maria Jesus Martin martin at ebi.ac.uk
Tue Apr 2 21:16:57 EST 2002


TrEMBL is a computer-annotated protein sequence database
supplementing the SWISS-PROT Protein Knowledgebase. TrEMBL
contains the translations of all coding sequences (CDS)
present in the EMBL Nucleotide Sequence Database not yet
integrated in SWISS-PROT. TrEMBL can be considered a
preliminary section of SWISS-PROT. SWISS-PROT accession
numbers have been assigned to all TrEMBL entries that
should eventually be upgraded to standard SWISS-PROT
quality.


This TrEMBL release was created from the EMBL Nucleotide
Sequence Database release 69 and updates until 08.02.02.
It contains 700'753 entries and 203'489'769 amino acids.
To minimize redundancy, the translations of all coding
sequences (CDS) in the EMBL Nucleotide Sequence Database
already included in SWISS-PROT release 40 and updates
until 27.03.02 have been removed from TrEMBL release 20.

TrEMBL is split into two main sections: SP-TrEMBL and
REM-TrEMBL. SP-TrEMBL (SWISS-PROT TrEMBL) contains the
entries (623'159) which should eventually be incorporated
into SWISS-PROT. SWISS-PROT accession numbers have
been assigned to all SP-TrEMBL entries.

SP-TrEMBL is organized in subsections:

arc.dat (Archaea):                         1721 entries
arp.dat (Complete Archaeal proteomes):    22019 entries
fun.dat (Fungi):                          14172 entries
hum.dat (Human):                          29751 entries
inv.dat (Invertebrates):                  61859 entries
mam.dat (Other Mammals):                  10260 entries
mhc.dat (MHC proteins):                    7673 entries
org.dat (Organelles):                     55796 entries
phg.dat (Bacteriophages):                  4793 entries
pln.dat (Plants):                         61896 entries
pro.dat (Prokaryotes):                    71431 entries
prp.dat (Complete Prokaryote Proteomes): 108287 entries
rod.dat (Rodents):                        25972 entries
unc.dat (Unclassified):                     143 entries
vrl.dat (Viruses):                        65583 entries
vrt.dat (Other Vertebrates):              11702 entries
vrv.dat (Retroviruses):                   70101 entries
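
A downloaded subsection file can be checked against the counts
listed above. The following is a minimal sketch only. It assumes
the .dat files use the SWISS-PROT flat-file layout, in which every
entry ends with a line consisting of "//". The function name
count_entries and the two sample entries are hypothetical and are
not taken from the release.

```python
import io


def count_entries(lines):
    """Count flat-file entries by counting '//' terminator lines.

    Assumption: each entry ends with a line that is exactly '//'.
    """
    return sum(1 for line in lines if line.rstrip("\r\n") == "//")


# Tiny in-memory stand-in for a subsection file (made-up content,
# used only to exercise the function).
sample = io.StringIO(
    "ID   EXAMPLE1    PRELIMINARY;      PRT;    10 AA.\n"
    "//\n"
    "ID   EXAMPLE2    PRELIMINARY;      PRT;    12 AA.\n"
    "//\n"
)
n_sample = count_entries(sample)
print(n_sample)
```

For a real file, the same function could be called on an open file
handle, for example open("arc.dat"), and the result compared with
the figure given for that subsection.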

65'161 new entries have been integrated in SP-TrEMBL.
The sequences of 840 SP-TrEMBL entries have been
updated and the annotation has been updated in
231'260 entries.

In the document deleteac.txt, you will find a list of
all accession numbers which were previously present in
TrEMBL, but which have now been deleted from the database.

REM-TrEMBL (REMaining TrEMBL) contains the entries
(77'594) that we do not want to include in SWISS-PROT.
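
The figures quoted in this announcement can be cross-checked with a
few lines of arithmetic. The sketch below only restates numbers
given in the text; the variable names are illustrative.

```python
# Entry counts per SP-TrEMBL subsection, as listed in this announcement.
sp_trembl_subsections = {
    "arc": 1721, "arp": 22019, "fun": 14172, "hum": 29751,
    "inv": 61859, "mam": 10260, "mhc": 7673, "org": 55796,
    "phg": 4793, "pln": 61896, "pro": 71431, "prp": 108287,
    "rod": 25972, "unc": 143, "vrl": 65583, "vrt": 11702,
    "vrv": 70101,
}

SP_TREMBL_TOTAL = 623_159   # entries in SP-TrEMBL
REM_TREMBL_TOTAL = 77_594   # entries in REM-TrEMBL
RELEASE_TOTAL = 700_753     # entries in TrEMBL release 20

subsection_sum = sum(sp_trembl_subsections.values())

# The subsections should add up to the SP-TrEMBL total, and the two
# main sections should add up to the release total.
subsections_match = subsection_sum == SP_TREMBL_TOTAL
sections_match = SP_TREMBL_TOTAL + REM_TREMBL_TOTAL == RELEASE_TOTAL
print(subsections_match, sections_match)
```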


FTP server:     ftp.ebi.ac.uk/pub/databases/trembl
SRS server:     http://srs.ebi.ac.uk/

TrEMBL is also available on the SWISS-PROT CD-ROM.
SWISS-PROT + TrEMBL is searchable on the following
servers at the EBI:

FASTA3  (http://www.ebi.ac.uk/fasta33/)
BLAST2  (http://www.ebi.ac.uk/blast2/)
Bic_sw  (http://www.ebi.ac.uk/bic_sw/)
Scanps  (http://www.ebi.ac.uk/scanps/)
MPSrch  (http://www.ebi.ac.uk/MPsrch/)

For each TrEMBL release, a synchronized version
of the concurrent SWISS-PROT release is distributed
at ftp.ebi.ac.uk/pub/databases/trembl/swissprot/


Rolf Apweiler, Kirsty Bates, Margaret Biswas,
Sergio Contrino, Daniel Barrell, Kirill Degtyarenko,
Wolfgang Fleischmann, Gill Fraser, Henning Hermjakob,
Kati Laiho, Alexander Kanapin, Youla Karavidopoulou,
Paul Kersey, Minna Lehvaslaiho, Michele Magrane,
Maria Jesus Martin, Virginie Mittard, Nicola Mulder,
Claire O'Donovan, John F. O'Rourke, Eleanor Whitfield
and Allyson Williams at the EMBL Outstation -
European Bioinformatics Institute (EBI) in Hinxton, UK;
Amos Bairoch, Isabelle Phan, Sandrine Pilbout,
Alain Gateau and Alexandre Gattiker at the Swiss
Institute of Bioinformatics in Geneva, Switzerland.

Maria Jesus Martin                     Email: martin at ebi.ac.uk
EMBL Outstation EBI
(European Bioinformatics Institute)    URL: http://www.ebi.ac.uk
Wellcome Trust Genome Campus           Tel: +44 (1223) 494408
Hinxton                                Fax: +44 (1223) 494468

