NCBI Data Repository CD-ROM

Scott Federhen federhen at wisp.nlm.nih.gov
Thu Jun 11 10:46:22 EST 1992

The first release of the NCBI Data Repository CD-ROM is now available.
The NCBI Data Repository was established as a service for providing a
public distribution site for databases maintained by individual developers
or groups. The databases and software are not officially supported or
maintained by NCBI, nor does NCBI assume responsibility for the accuracy
or reliability of the data or software. Each data collection is solely 
the responsibility of the individual developer and the data is made 
available by NCBI 'as is'.

The Data Repository CD-ROMs are currently being distributed at no charge
on an experimental basis; a subscription service may be set up for future
releases. New releases are planned for every six months, with the next
release scheduled for October, 1992. The frequency of releases may be 
increased depending on the demand for the CD-ROMs and the updating 
frequency of the individual databases.

The Data Repository is also accessible over the Internet by anonymous
FTP to 'ncbi.nlm.nih.gov'. Under the directory 'repository',
each collection of data is stored in individual subdirectories and is
accompanied by README files for file descriptions and the names of 
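
For readers who would rather script the anonymous FTP access described
above, here is a minimal Python sketch using the standard ftplib module.
The host and the 'repository' directory come from this announcement;
everything else is an assumption made for illustration: that 'repository'
sits at the server root, that each subdirectory carries the collection's
short name (e.g. 'rebase'), and that the descriptive file is named exactly
'README'. The helper names are hypothetical.

```python
from ftplib import FTP

HOST = "ncbi.nlm.nih.gov"   # host named in the announcement
TOP_DIR = "repository"      # directory named in the announcement


def collection_path(collection):
    """Build the path of one collection's subdirectory.

    Assumption: 'repository' is at the server root and the subdirectory
    is named after the collection.
    """
    return f"/{TOP_DIR}/{collection}"


def open_repository(host=HOST):
    """Connect and log in anonymously (ftplib's default login)."""
    ftp = FTP(host)
    ftp.login()
    return ftp


def list_collection(ftp, collection):
    """Return the file names the server reports for one collection."""
    return ftp.nlst(collection_path(collection))


def fetch_readme(ftp, collection, name="README"):
    """Retrieve a text file from a collection and return its contents.

    Assumption: the descriptive file is called 'README'; pass `name`
    if the collection uses something else.
    """
    lines = []
    ftp.retrlines(f"RETR {collection_path(collection)}/{name}", lines.append)
    return "\n".join(lines)
```

A session might then read: ftp = open_repository(), followed by
print(fetch_readme(ftp, "rebase")) and ftp.quit(). The helpers take the
connection as an argument so they can be tried against a stand-in object
without touching the network.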

Questions, suggestions, requests for copies of the CD-ROM, and
proposals for additions to the repository should be addressed to
'repository at ncbi.nlm.nih.gov', or:

			NCBI Data Repository
			National Library of Medicine
			Bldg. 38A, Rm 8N-803
			Bethesda, MD 20894
			Phone:  (301) 496-2475

Scott Federhen
Manager, NCBI Data Repository


tfd -  Transcription Factor Database.  A relational database of transcription
       factors maintained by David Ghosh (ghosh at ncbi.nlm.nih.gov), NCBI.
       Last update: Mar. 10, 1992.

ngdd - Normalized Gene Designation Database.  Normalized gene maps for E. coli,
       Salmonella, Bacillus subtilis, Pseudomonas aeruginosa, and Caulobacter
       crescentus from Yvon Abel and Robert Cedergren, University of Montreal.
       Last update: Jun. 25, 1990.

epd -  Eukaryotic Promoter Database. A collection of biologically functional,
       experimentally defined RNA POL II promoters active in higher eukaryotes.
       Maintained by Philipp Bucher (Philipp.Bucher at Isrec.Arcom.ch).
       Last update: Apr. 3, 1992.

limb - LIsting of Molecular Biology databases. A collection of information
       about the content and maintenance of a large number of databases of
       interest to the molecular biology community. Maintained by
       Graham Redgrave (gwr at life.lanl.gov), Los Alamos National Laboratory.
       Last update: Mar. 26, 1991.

compound - A knowledge base of compounds involved in intermediary
       metabolism. Maintained by Peter Karp, SRI. (pkarp at ai.sri.com)
       Last update: Jan. 29, 1992.

metproto - A database of metabolic reactions, and associated DOS software.
       Maintained by Ray Ochs, Kansas State University. (rso2 at po.cwru.edu)
       Last update: Apr. 24, 1992.

rebase - Restriction Enzyme Database. A collection of information about
       restriction enzymes, their cutting sites and commercial sources.
       Maintained by Richard Roberts, Cold Spring Harbor Laboratory.
       (roberts at cshl.org)
       Last update: Mar. 16, 1992.

prosite - An annotated database of protein sequence motifs. Maintained by
       Amos Bairoch, University of Geneva. (bairoch at cmu.unige.ch)
       Last update: Mar. 13, 1992.

enzyme - The Enzyme Data Bank, a database of information about enzymes,
       including names, catalytic activity, cofactors, and pointers to
       relevant entries in sequence databases. This directory also includes
       an ASN.1 encoding of the database. Maintained by Amos Bairoch,
       University of Geneva. (bairoch at cmu.unige.ch)
       Last update: Mar. 13, 1992.

eco -  An E. coli genomic database. This directory includes DOS and Mac
       software. Maintained by Kenn Rudd, NCBI. (rudd at ncbi.nlm.nih.gov)
       Last update: Jan. 7, 1992.

flybase - The Drosophila Genetic Database, the genomic database for the
       fruit fly Drosophila melanogaster. Maintained by Michael Ashburner.
       (ma11 at phx.cam.ac.uk)
       Last update: Mar. 9, 1992.

acedb - A C. elegans Database, the genomic database for the nematode
       Caenorhabditis elegans. This directory includes software and 
       an installation script for running the system on several hardware
       platforms, including SPARCstations, DECstations, and SGIs.
       Maintained by Richard Durbin (rd at cele.mrc-lmb.cam.ac.uk) 
       and Jean Thierry-Mieg (mieg at frmop11.bitnet)
       Last update: Apr. 24, 1992.

kabat - A collection of sequences of immunological importance, including
       protein and nucleic acid sequences and alignments. Compiled by
       Elvin Kabat (kabat at ncbi.nlm.nih.gov). Maintained by Harold Perry
       (hperry at bbn.com)
       Last update: Mar. 9, 1992.

aids-db - A collection of sequences related to the HIV family of viruses.
       Gerry Myers, LANL. (glm at life.lanl.gov)
       Kersti MacInnes, LANL. (kam at life.lanl.gov)
       Last update: Apr. 22, 1992.

carbbank - A PC-based database and software system which contains 
       information about the structure of complex carbohydrates. This
       includes the Complex Carbohydrate Structure Database (CCSD) and
       the CarbBank software system.
       Maintained by Dana Smith, Scott Doubet and Peter Albersheim.
       (CarbBank at UGA.bitnet) or (76424.1122 at compuserve.com)
       Last update: Mar. 9, 1992.

blocks - A database of protein sequence homology blocks, constructed
       from SwissProt and PROSITE. Includes Unix and DOS software
       packages used to make the database. Maintained by Steven and
       Jorja Henikoff. (henikoff at sparky.fhcrc.org)
       Last update: Feb. 28, 1992.

t4phage - A genomic database for the T4 phage. Maintained by Elizabeth
       Kutter, University of Washington (t4phage at u.washington.edu) and
       David Batts, Evergreen State College (t4 at milton.u.washington.edu)
       Last update: Mar. 25, 1992.

eco2dbase - The E. coli gene-protein database, which links information
       about E. coli genes and their protein spots on 2-D gels.
       Maintained by Frederick C. Neidhardt, University of Michigan.
       Last update: Mar. 10, 1992.

pkinases - A non-redundant annotated collection of protein kinase
       sequences. Maintained by Anne Marie Quinn, Salk Institute.
       (quinn at salk-sc2.sdsc.edu)

rldb - The Reference Library DataBase, a collection of information 
       about the chromosomal locations of a set of publicly available
       DNA probes. Maintained by Guenther Zehetner, Imperial Cancer
       Research Fund, Genome Analysis Laboratory. (G_Zehetner at icrf.ac.uk)
       Last update: Apr. 23, 1992.

Scott federhen at ncbi.nlm.nih.gov

