Release 19 of TREMBL, a protein sequence database supplementing SWISS-PROT

Maria Jesus Martin martin at ebi.ac.uk
Mon Dec 17 06:05:55 EST 2001


INTRODUCTION
============

TrEMBL is a computer-annotated protein sequence database
supplementing the SWISS-PROT Protein Knowledgebase. TrEMBL
contains the translations of all coding sequences (CDS)
present in the EMBL Nucleotide Sequence Database that are
not yet integrated in SWISS-PROT. TrEMBL can be considered
a preliminary section of SWISS-PROT. SWISS-PROT accession
numbers have been assigned to all TrEMBL entries that should
eventually be upgraded to standard SWISS-PROT quality.


RELEASE 19.0 OF TrEMBL
======================

This TrEMBL release was created from release 68 of the
EMBL Nucleotide Sequence Database, with updates until
16.11.01. It contains 636'825 entries and 184'332'036
amino acids. To minimize redundancy, the translations of
all coding sequences (CDS) in the EMBL Nucleotide Sequence
Database that are already included in SWISS-PROT release 40,
with updates until 13.12.01, have been removed from TrEMBL
release 19.

TrEMBL is split into two main sections, SP-TrEMBL and
REM-TrEMBL.

SP-TrEMBL (SWISS-PROT TrEMBL) contains the entries
(562'222) which should eventually be incorporated
into SWISS-PROT. SWISS-PROT accession numbers have
been assigned to all SP-TrEMBL entries.

SP-TrEMBL is organized in subsections:

arc.dat (Archaea):                        1692 entries
arp.dat (Complete Archaeal Proteomes):   19590 entries
fun.dat (Fungi):                         13339 entries
hum.dat (Human):                         29064 entries
inv.dat (Invertebrates):                 60611 entries
mam.dat (Other Mammals):                  9724 entries
mhc.dat (MHC proteins):                   7434 entries
org.dat (Organelles):                    50792 entries
phg.dat (Bacteriophages):                 4368 entries
pln.dat (Plants):                        58841 entries
pro.dat (Prokaryotes):                   69426 entries
prp.dat (Complete Prokaryote Proteomes): 74477 entries
rod.dat (Rodents):                       24185 entries
unc.dat (Unclassified):                    252 entries
vrl.dat (Viruses):                       60309 entries
vrt.dat (Other Vertebrates):             11003 entries
vrv.dat (Retroviruses):                  67115 entries
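
As a quick consistency check, the subsection counts listed
above can be summed and compared with the SP-TrEMBL total of
562'222 entries quoted earlier. The following is only an
illustrative Python sketch; the names SUBSECTION_COUNTS and
sp_trembl_total are made up for this example and are not part
of the TrEMBL distribution.

```python
# Entry counts per SP-TrEMBL subsection, copied from the listing above.
# The variable and function names are illustrative assumptions only.
SUBSECTION_COUNTS = {
    "arc.dat": 1692,   # Archaea
    "arp.dat": 19590,  # Complete Archaeal proteomes
    "fun.dat": 13339,  # Fungi
    "hum.dat": 29064,  # Human
    "inv.dat": 60611,  # Invertebrates
    "mam.dat": 9724,   # Other Mammals
    "mhc.dat": 7434,   # MHC proteins
    "org.dat": 50792,  # Organelles
    "phg.dat": 4368,   # Bacteriophages
    "pln.dat": 58841,  # Plants
    "pro.dat": 69426,  # Prokaryotes
    "prp.dat": 74477,  # Complete Prokaryote Proteomes
    "rod.dat": 24185,  # Rodents
    "unc.dat": 252,    # Unclassified
    "vrl.dat": 60309,  # Viruses
    "vrt.dat": 11003,  # Other Vertebrates
    "vrv.dat": 67115,  # Retroviruses
}


def sp_trembl_total(counts):
    """Return the total number of entries across all subsections."""
    return sum(counts.values())


if __name__ == "__main__":
    total = sp_trembl_total(SUBSECTION_COUNTS)
    print(f"{len(SUBSECTION_COUNTS)} subsections, {total} entries in total")
```

The sum of the 17 subsections agrees with the 562'222
SP-TrEMBL entries stated for this release.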

80'772 new entries have been integrated in SP-TrEMBL.
The sequences of 1388 SP-TrEMBL entries have been updated
and the annotation has been updated in 321'110 entries.

The document deleteac.txt lists all accession numbers
that were previously present in TrEMBL but have now been
deleted from the database.

REM-TrEMBL (REMaining TrEMBL) contains the entries
(74'603) that we do not want to include in SWISS-PROT.
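
The figures quoted in this announcement can be cross-checked:
the SP-TrEMBL and REM-TrEMBL counts should add up to the total
number of entries in the release. The following is a minimal
Python sketch that uses only numbers given in this text; the
helper parse_count and the other names are hypothetical and
chosen for illustration.

```python
def parse_count(text):
    """Convert a count such as "636'825" to an int.

    The apostrophe is used in this announcement as a thousands separator.
    """
    return int(text.replace("'", ""))


# Figures as written in the release notes (names are illustrative only).
TOTAL_ENTRIES = parse_count("636'825")
TOTAL_AMINO_ACIDS = parse_count("184'332'036")
SP_TREMBL_ENTRIES = parse_count("562'222")
REM_TREMBL_ENTRIES = parse_count("74'603")


def sections_consistent():
    """True if the two sections together account for every entry."""
    return SP_TREMBL_ENTRIES + REM_TREMBL_ENTRIES == TOTAL_ENTRIES


def mean_sequence_length():
    """Average number of amino acids per entry over the whole release."""
    return TOTAL_AMINO_ACIDS / TOTAL_ENTRIES


if __name__ == "__main__":
    print("sections add up:", sections_consistent())
    print(f"mean sequence length: {mean_sequence_length():.1f} amino acids")
```

The two section counts add up to the 636'825 entries stated
for release 19, and the mean sequence length works out to
between 289 and 290 amino acids per entry.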

ACCESS/DATA DISTRIBUTION
========================

FTP server:     ftp.ebi.ac.uk/pub/databases/trembl
SRS server:     http://srs.ebi.ac.uk/

TrEMBL is also available on the SWISS-PROT CD-ROM.
SWISS-PROT + TrEMBL is searchable on the following
servers at the EBI:

FASTA3  (http://www.ebi.ac.uk/fasta33/)
BLAST2  (http://www.ebi.ac.uk/blast2/)
Bic_sw  (http://www.ebi.ac.uk/bic_sw/)
Scanps  (http://www.ebi.ac.uk/scanps/)
MPSrch  (http://www.ebi.ac.uk/MPsrch/)

TrEMBL HAS BEEN PREPARED BY:
============================

Rolf Apweiler, Kirsty Bates, Margaret Biswas,
Sergio Contrino, Daniel Barrell, Kirill Degtyarenko,
Wolfgang Fleischmann, Gill Fraser, Henning Hermjakob,
Kati Laiho, Alexander Kanapin, Youla Karavidopoulou,
Paul Kersey, Minna Lehvaslaiho, Michele Magrane,
Maria Jesus Martin, Virginie Mittard, Nicola Mulder,
Claire O'Donovan, John F. O'Rourke, Eleanor Whitfield
and Allyson Williams at the EMBL Outstation -
European Bioinformatics Institute (EBI) in Hinxton, UK;
Amos Bairoch, Isabelle Phan, Sandrine Pilbout and
Alain Gateau at the Swiss Institute of Bioinformatics
in Geneva, Switzerland.


-----------------------------------------------
Maria Jesus Martin                     email:martin at ebi.ac.uk
EMBL Outstation EBI
(European Bioinformatics Institute)    URL: http://www.ebi.ac.uk
Wellcome Trust Genome Campus           Tel: +44 (1223) 494408
Hinxton                                fax: +44 (1223) 494468
Cambridge
CB10 1SD UK




