dbEST - database for "expressed sequence tags"
Summary - January 21, 1994
This is a regular announcement to indicate the status of dbEST at
GenBank, National Center for Biotechnology Information (NCBI),
National Library of Medicine, National Institutes of Health.
dbEST is a new resource (Nature Genetics 4:332-333; 1993) that contains
data from laboratories generating incomplete "single-pass" cDNA sequences
(ESTs or "Expressed Sequence Tags," also known as "Transcribed Sequence
Fragments" and "Putatively Transcribed Partial Sequences").
Although dbEST sequences are incorporated into the new EST Division of
GenBank (Nucl. Acids Res. 21:2963-2965; 1993), annotation in dbEST is
more comprehensive and includes detailed contact information about the
contributors, genetic map locations (when available), and instructions
on obtaining physical DNA clones from the American Type Culture Collection
and other sources. In addition, NCBI periodically updates putative
homology assignments using the BLAST family of programs after "filtering"
the ESTs to mask vector contamination, repetitive elements, and
low-complexity subsequences in the conceptual translations.
dbEST data is available in a variety of forms, described below.
Information on the current release is as follows:
Database version number: 1.46
Number of Entries: 31,818
Summary by Organism
Homo sapiens (human): 16933
Caenorhabditis elegans (nematode): 4699
Arabidopsis thaliana (thale-cress): 4512*
Oryza sativa (rice): 4231*
Plasmodium falciparum (malaria): 831
Zea mays (maize): 267*
Mus musculus+domesticus (mouse): 150
Capra hircus (goat): 108
Pyrococcus furiosus: 50
Macropus eugenii (marsupial): 36
Gallus gallus (chicken): 1
* Note that we have recently received a large number of ESTs from
several plant species.
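As a quick consistency check, the per-organism counts above can be summed
and compared with the "Number of Entries" figure. The short Python sketch
below does only that; the dictionary name and layout are illustrative and
are not part of any dbEST tool.

```python
# Per-organism EST counts, copied from the "Summary by Organism" list above.
counts = {
    "Homo sapiens": 16933,
    "Caenorhabditis elegans": 4699,
    "Arabidopsis thaliana": 4512,
    "Oryza sativa": 4231,
    "Plasmodium falciparum": 831,
    "Zea mays": 267,
    "Mus musculus+domesticus": 150,
    "Capra hircus": 108,
    "Pyrococcus furiosus": 50,
    "Macropus eugenii": 36,
    "Gallus gallus": 1,
}

total = sum(counts.values())
print(total)  # 31818, matching the reported number of entries
```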
ACCESS TO EST DATA
1) The nucleotide sequences may be searched using the BLAST electronic
mail server. For more information send an e-mail message with the
word "help" in the body of the message to blast at ncbi.nlm.nih.gov.
The TBLASTN program takes an amino acid query sequence and
compares it with six-frame translations of dbEST DNA sequences.
2) Full reports on ESTs, including homology data, can be retrieved from
the dbEST electronic mail server. For more information send an
e-mail message with the word "help" in the body of the message to
est_report at ncbi.nlm.nih.gov
3) EST sequences are included in the new EST division of GenBank (R)
available from NCBI on CD-ROMs and by anonymous ftp. Individual records
may be retrieved using the RETRIEVE electronic mail server. For more
information send an e-mail message with the word "help" in the body of
the message to retrieve at ncbi.nlm.nih.gov
4) EST sequences are also available as a flat file in the FASTA format by
anonymous FTP in the /repository/dbEST directory at ncbi.nlm.nih.gov
5) We are also planning for Gopher and WWW access to EST information. See
future postings of this announcement and "NCBI News." (For a free
subscription, send a request along with your name and postal mailing
address to: info at ncbi.nlm.nih.gov)
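For readers working with the FASTA flat file mentioned in item 4, the
Python sketch below illustrates what "six-frame translation" (item 1)
means: each DNA sequence is translated in three reading frames on the
forward strand and three on its reverse complement. This is only an
illustration under stated assumptions, not the TBLASTN implementation:
it assumes the standard genetic code, performs no similarity search, and
all function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (A, C, G, T, N)."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translations(dna):
    """Map frame labels '+1'..'+3' and '-1'..'-3' to conceptual translations."""
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


# Usage with a made-up record (not a real dbEST entry):
demo = [">demo_est hypothetical example", "ATG", "GCC"]
for header, sequence in read_fasta(demo):
    print(header, six_frame_translations(sequence))
```

For the made-up sequence ATGGCC, frame +1 reads ATG GCC and gives "MA",
while frame -1 reads the reverse complement GGC CAT and gives "GH".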
National Center for Biotechnology Information
National Library of Medicine,
National Institutes of Health
Bethesda, MD, 20894, USA
telephone: (301) 496-2475
fax: (301) 480-9241
e-mail: info at ncbi.nlm.nih.gov
WWW URL: http://www.ncbi.nlm.nih.gov