Greetings NetLanders:
In September 1992, I sent the enclosed message out on the bulletin boards.
I never got any replies, so I decided to tackle the job myself. The matrices
that I reformatted appear to operate without problems in the GCG FitConsensus
program. I have deposited the reformatted weight matrices on our VAX cluster
in the ANONYMOUS ftp account for public use. The enclosed message follows:
********************************************************************************
start of enclosed text
Has anyone heard of or done for themselves the conversion of Dr. Bucher's
weight matrix descriptions of eukaryotic promoter elements to GCG
Consensus.Csn format? These matrices are described in _J._Mol._Biol._
(1990) 212: 563-578. I am teaching a course this semester on computer
techniques in molecular biology and would very much like to use this data
to illustrate the power of weight matrix approaches versus simple
one-dimensional pattern matching. I realize that the conversion probably
is not that difficult; however, if it has already been done, it sure would
save me some time.
While on this topic, I've searched LIMB to see if there is a database of
weight matrix consensus descriptions and found nothing. Have any of you
heard of a collection of this type of data? I do have access to Dr.
Bucher's EPD database but am especially interested in weight matrix
patterns.
end of enclosed text
********************************************************************************
GCG supplies preassembled consensus weight matrices of the donor and acceptor
site sequences at exon-intron splice junctions in its public data files, for
use with FitConsensus. However, it does not provide any others;
therefore, I have reformatted the four weight matrix descriptions of eukaryotic
RNA polymerase II promoter elements reported by Bucher (1990) into a form
appropriate for GCG's programs. Additionally, McLauchlan et al. (1985)
assembled a eukaryotic terminator weight matrix, which I have reformatted for
GCG use. These files have the following names: TATA.Csn, Cap.Csn, CCAAT.Csn,
GC.Csn and Terminator.Csn.
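
As a rough illustration of the contrast drawn in the enclosed message between
weight matrix approaches and simple one-dimensional pattern matching, here is
a minimal Python sketch. It does not read Consensus.Csn files and has nothing
to do with how FitConsensus is implemented. The consensus string YGTGTTYY is
taken from the McLauchlan et al. title; the weights (+2 for an allowed base,
-1 otherwise) and all function names are invented for this example and are
NOT the published matrix values. Input is assumed to be uppercase A, C, G, T.

```python
import re

# Consensus from the McLauchlan et al. (1985) title; Y means C or T.
CONSENSUS = "YGTGTTYY"
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "[CT]"}


def consensus_hits(seq):
    """Start positions where seq matches the consensus exactly (all or nothing)."""
    pattern = "".join(IUPAC[c] for c in CONSENSUS)
    # A lookahead is used so that overlapping matches are also reported.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", seq)]


def toy_matrix():
    """Hypothetical weight matrix: one {base: weight} dict per position.

    The weights are made up for illustration (+2 allowed base, -1 otherwise).
    """
    rows = []
    for c in CONSENSUS:
        allowed = "CT" if c == "Y" else c
        rows.append({b: (2 if b in allowed else -1) for b in "ACGT"})
    return rows


def matrix_scores(seq, matrix):
    """Score every window of seq by summing the per-position weights."""
    width = len(matrix)
    return [
        (i, sum(row[seq[i + j]] for j, row in enumerate(matrix)))
        for i in range(len(seq) - width + 1)
    ]


def best_window(seq, matrix):
    """Return (start, score) for the highest-scoring window."""
    return max(matrix_scores(seq, matrix), key=lambda pair: pair[1])


if __name__ == "__main__":
    exact = "AACGTGTTCTAA"      # contains CGTGTTCT, a perfect match
    near_miss = "AACGTGATCTAA"  # same site with one mismatched base
    for s in (exact, near_miss):
        print(s, consensus_hits(s), best_window(s, toy_matrix()))
```

With these toy weights, the pattern match finds nothing in the second
sequence, while the matrix still ranks the same window highest with a lower
score. A graded score of this kind is what a weight matrix adds over a
yes-or-no pattern.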
********************************************************************************
_References_
Bucher, P. (1990). Weight Matrix Descriptions of Four Eukaryotic RNA Polymerase
II Promoter Elements Derived from 502 Unrelated Promoter Sequences. Journal of
Molecular Biology 212, 563-578.
McLauchlan, J., Gaffney, D., Whitton, J. and Clements, J. (1985). The Consensus
Sequence YGTGTTYY Located Downstream from the AATAAA Signal is Required for
Efficient Formation of mRNA 3' Termini. Nucleic Acids Research 13, 1347-1368.
********************************************************************************
The directory structure and logon information for our anonymous account
follow. In addition to this Consensus subdirectory, the MolBio directory also
contains the subdirectory Profiles, which, in turn, contains several profile
subdirectories and their associated profile matrices which I deposited last
summer. Again, thanks for all of the support; I hope this data can be of some
use to you.
********************************************************************************
Internet address: bobcat.csc.wsu.edu 134.121.1.1
alias: wsuvms1.csc.wsu.edu
logon as: USER ANONYMOUS
password: your Internet address
********************************************************************************
path: root/molbio/consensus (however, this is a VMS site, not Unix!)
Directory [ANONYMOUS.MOLBIO.CONSENSUS]
CAP.CSN CCAAT.CSN GC.CSN TATA.CSN TERMINATOR.CSN
and README.TXT (this file)
********************************************************************************
Steven M. Thompson
Consultant in Molecular Genetics and Sequence Analysis
VADMS (Visualization, Analysis & Design in the Molecular Sciences) Laboratory
Washington State University, Pullman, WA 99164-1224, USA
AT&Tnet: (509) 335-0533 or 335-3179 FAX: (509) 335-0540
BITnet: THOMPSON at WSUVMS1 or STEVET at WSUVM1
INTERnet: THOMPSON at wsuvms1.csc.wsu.edu
********************************************************************************