This is a summary of the promoter files I have collected so far.
It covers eukaryotic promoters but not prokaryotic promoters.
If more arrive I will summarize again.
Dan Jacobson (danj at jhuhyg.sph.jhu.edu) sent the following information about
the Eukaryotic Promoter Database (EPD). The lines marked '*' are my notes,
placed below the files I have looked at.
=============================================================================
Host fly.bio.indiana.edu
Location: /molbio/data
FILE -rw-r--r-- 94156 May 14 1991 euk-promoter.dat
Description of each site in Database but no Sequences.
FILE -rw-r--r-- 46615 May 14 1991 euk-promoter.doc
Manual
Host modl.unibas.ch
Location: /biology/database
DIRECTORY drwxr-xr-x 512 Sep 18 17:44 epd
Location: /biology/database/epd
FILE -rw-r--r-- 240952 Dec 27 12:47 epd.dat
FILE -rw-r--r-- 49665 Dec 27 12:47 epduser.txt
Host ncbi.nlm.nih.gov
Location: /repository/EPD/asn
FILE -rw-r--r-- 1154526 Aug 30 00:41 epd28.asn
FILE -rw-r--r-- 388444 Aug 30 00:39 epd28.bin
FILE -rw-r--r-- 1183577 Dec 17 16:07 epd29.asn
FILE -rw-r--r-- 398119 Dec 17 16:10 epd29.bin
Location: /repository/EPD/db
FILE -rw-r--r-- 236892 Aug 22 17:46 epd28.dat
FILE -rw-r--r-- 49307 Aug 22 17:47 epd28.doc
FILE -rw-r--r-- 240950 Dec 17 16:13 epd29.dat
* Database that describes the complete sequences that
* the promoters came from.
FILE -rw-r--r-- 49664 Dec 17 16:13 epd29.doc
* Manual that describes the database.
FILE -rw-r--r-- 676489 Dec 17 16:05 epd29.seq
* This one contains actual promoter sequences.
Location: /repository/EPD/ssa
FILE -rw-r--r-- 77657 Aug 30 01:44 epd28.chk
FILE -rw-r--r-- 79520 Dec 17 16:39 epd29.chk
===============================================================================
This is the README file from Host : ncbi.nlm.nih.gov
Directory : /repository/EPD/db
:::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
Contents of the EPD directory:
Subdirectory db:
- epd28.dat EPD database release 28
- epd28.doc EPD user manual release 28
- epd29.dat EPD database release 29
- epd29.doc EPD user manual release 29
- epd29.seq EPD release 29 sequence data in FASTA format
Notes: - The original EPD format defines promoter sequences
indirectly by pointers to EMBL sequence data. These
pointers are only correct for EPD and EMBL versions
of the same release.
- The sequence headers in epd29.seq are of the following
type:
>EPD17001 (+) Pv snRNA U1; range -499 to 100.
The sequence identifier consists of the acronym EPD
followed by the corresponding EPD entry code. The
plus sign in parentheses reflects the "independent
subset status" as described in the EPD user manual.
For statistical analysis, it is recommended to use only
those sequences exhibiting the string "(+)" in the header
line.
Subdirectory asn:
- epd28.asn EPD release 28 in ASN.1 print value format
- epd28.bin EPD release 28 in ASN.1 binary value format
- epd29.asn EPD release 29 in ASN.1 print value format
- epd29.bin EPD release 29 in ASN.1 binary value format
- asn.all NCBI ASN.1 definitions
- EPDtoASN.f FORTRAN program for epd.dat to epd.asn conversion
Notes: - The ASN.1 version of EPD is not equivalent to the original
EPD database. Certain types of information are missing.
- This EPD version is based on the general NCBI ASN.1
definitions contained in file asn.all .
Subdirectory ssa:
- SSASYSM.doc SSA sequence-retrieval programs installation notes
- SSAUSRM.doc SSA sequence-retrieval programs documentation
- FromFPS.for Main program FromFPS
- PRDSM.for Main program PRDSM
- RDBSQ.for Subroutine RDBSQ
- REMBL.for Subroutine REMBL
- STREV.for Subroutine STREV
- epd28.chk Test output file for EPD release 28 (see SSASYSM.doc)
- epd29.chk Test output file for EPD release 29 (see SSASYSM.doc)
Notes: These are the sequence retrieval programs mentioned in the EPD
user manual which run in a VAX/VMS/UWGCG environment.
Philipp Bucher, December 17 1991
===============================================================================
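For anyone who wants to pull only the recommended "(+)" sequences out of
epd29.seq, here is a rough Python sketch based solely on the one header
example quoted in the README above. The names EpdHeader, parse_header and
independent_records are my own, not part of EPD. I am assuming that every
header follows the same layout as that example, and that entries outside the
independent subset carry some other mark inside the parentheses; I have not
checked either assumption against the whole file.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Tuple

# Pattern modelled on the single example header in the README:
#   >EPD17001 (+) Pv snRNA U1; range -499 to 100.
# Assumption: all headers in epd29.seq share this layout.
HEADER_RE = re.compile(
    r"^>EPD(?P<code>\S+)\s+\((?P<flag>[^)]*)\)\s+(?P<desc>.*?);\s*"
    r"range\s+(?P<start>-?\d+)\s+to\s+(?P<end>-?\d+)\.?\s*$"
)


class EpdHeader(NamedTuple):
    entry_code: str     # EPD entry code that follows the "EPD" acronym
    independent: bool   # True when the header shows "(+)"
    description: str    # free text before the "; range" part
    start: int          # first position of the stated range
    end: int            # last position of the stated range


def parse_header(line: str) -> EpdHeader:
    """Parse one header line; raise ValueError if it does not fit the pattern."""
    m = HEADER_RE.match(line)
    if m is None:
        raise ValueError("unrecognized EPD header: %r" % line)
    return EpdHeader(
        entry_code=m.group("code"),
        independent=(m.group("flag") == "+"),
        description=m.group("desc"),
        start=int(m.group("start")),
        end=int(m.group("end")),
    )


def independent_records(lines: Iterable[str]) -> Iterator[Tuple[EpdHeader, str]]:
    """Yield (header, sequence) for the entries whose header shows "(+)"."""
    header = None
    chunks = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith(">"):
            # A new header closes the previous record.
            if header is not None and header.independent:
                yield header, "".join(chunks)
            header = parse_header(line)
            chunks = []
        elif header is not None:
            chunks.append(line.strip())
    # Do not forget the last record in the file.
    if header is not None and header.independent:
        yield header, "".join(chunks)
```

Typical use would be to open a local copy of the file and loop over it, for
example: for header, seq in independent_records(open("epd29.seq")): ...
A header that does not match the pattern raises an error instead of being
skipped silently, so a wrong guess about the layout shows up right away.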
Jim Studier (studier at ninja.life.uiuc.edu)