
HEXON:Prediction of internal exons in Human DNA sequences

Dan Davison dbd at THEORY.BCHS.UH.EDU
Tue Jul 26 17:38:42 EST 1994


      The Baylor College Of Medicine Computational Biology Group
			     Houston, TX
		       announces a new service

				HEXON
	 Prediction of internal exons in Human DNA sequences


NOTE: This service is temporarily being provided through the
University of Houston Gene-Server.  Only two jobs will be run at a
time.

Analysis of uncharacterized human sequences is available by sending a
file containing a sequence name and a sequence (no more than 80 
characters/line) to:

       service at theory.bchs.uh.edu       

with the subject line "HEXON". 

Example: mail -s HEXON service at theory.bchs.uh.edu < test.seq

where test.seq is a file containing the sequence.
 
Method description:
**********************
   The method is based on discriminant analysis of open reading
   frames flanked by the GT and AG dinucleotides. Prediction is performed
   by a linear discriminant function that combines characteristics
   describing the donor and acceptor splice sites, the 5'- and 3'-intron
   regions, and the coding region of each such open reading frame.
   The current version of the program predicts only internal exons with
   the conserved GT and AG dinucleotides at the donor and acceptor splice
   sites, respectively. These usually account for more than 99% of all
   authentic splice sites. Future versions of the program will offer
   options for other splice-site variants and extensions to other species.
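
   As an illustration of the first step only, the following Python sketch
   enumerates candidate internal exons: stretches preceded by AG, followed
   by GT, and free of stop codons in at least one reading frame. It is not
   the HEXON implementation. The function names and the min_len/max_len
   limits are hypothetical, and the discriminant scoring itself is not
   reproduced because its features and weights are not given here.

```python
STOP_CODONS = {"taa", "tag", "tga"}


def open_frames(segment):
    """Return the frame offsets (0, 1, 2) in which segment has no stop codon."""
    frames = []
    for offset in range(3):
        codons = (segment[i:i + 3] for i in range(offset, len(segment) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            frames.append(offset)
    return frames


def candidate_exons(seq, min_len=30, max_len=1000):
    """List (start, end, frames) for AG...GT-flanked open stretches.

    start and end are 1-based inclusive positions of the candidate exon.
    min_len and max_len are illustrative limits, not values used by HEXON.
    """
    seq = seq.lower()
    # An exon starts right after an acceptor AG and ends right before a donor GT.
    starts = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "ag"]
    ends = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "gt"]
    candidates = []
    for start in starts:
        for end in ends:
            if min_len <= end - start <= max_len:
                frames = open_frames(seq[start:end])
                if frames:
                    candidates.append((start + 1, end, frames))
    return candidates
```

   Each candidate produced this way would then be scored by the
   discriminant function described above.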

Accuracy:
********************************
  The accuracy of precise internal exon recognition on a test set of 451
  exons is 77%, with a specificity of 79%. The recognition quality computed
  at the level of individual nucleotides is 88% for exon sequences and 98%
  for intron sequences. This corresponds to a correlation coefficient of
  0.87.
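
  The announcement does not define these measures. The Python sketch below
  assumes the usual nucleotide-level convention: the fraction of true exon
  nucleotides predicted as exon, the fraction of true intron nucleotides
  predicted as intron, and the correlation coefficient computed from the
  four confusion counts. The function name and the counts are hypothetical.

```python
import math


def nucleotide_accuracy(tp, fp, tn, fn):
    """Return (exon_fraction, intron_fraction, correlation) from counts.

    tp: exon nucleotides predicted as exon
    fn: exon nucleotides predicted as intron
    tn: intron nucleotides predicted as intron
    fp: intron nucleotides predicted as exon
    """
    exon_fraction = tp / (tp + fn)
    intron_fraction = tn / (tn + fp)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    correlation = (tp * tn - fp * fn) / denominator
    return exon_fraction, intron_fraction, correlation


# Made-up counts chosen only to exercise the function.
exon_frac, intron_frac, cc = nucleotide_accuracy(tp=88, fp=2, tn=98, fn=12)
```

  The correlation coefficient depends on the ratio of exon to intron
  nucleotides in the test set, so the two percentages alone do not
  determine it.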

 If you have a sequence containing a complete gene or 5'- or 3'-exons, you
can use the gene structure prediction program FGENEH or the exon recognition
program FGENEH from the server.

Submitting sequences via email:
***********************************
  For email submission the sequences must have the following format:  

Name of your sequence
ccatctctgtcttgcaggacaatgccgtcttctgtctcgtggggcatcctcctgctggca
ggcctgtgctgcctggtccctgtctccctggctgaggatccccagggagatgctgcccag
aagacagatacatcccaccatgatcaggatcacccaaccttcaacaagatcacccccaac
ctggctgagttcgccttcagcctataccgccagctggcacaccagtccaacagcaccaat
atcttcttctccccagtgagcatcg...............

   (Restrict the line length to 80 characters or fewer.)

   Send the file containing the sequence to:

   service at theory.bchs.uh.edu
   Subject line must be:
   hexon
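
   A file in this format can be prepared with a few lines of Python. This
   is a minimal sketch: the function name and the default width of 60
   (matching the example above) are assumptions, and only the 80-character
   limit comes from the instructions.

```python
def format_submission(name, sequence, width=60):
    """Return submission text: a name line, then wrapped sequence lines."""
    if not 1 <= width <= 80:
        raise ValueError("width must be between 1 and 80")
    # Drop any existing whitespace so the sequence can be re-wrapped cleanly.
    cleaned = "".join(sequence.split())
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return "\n".join([name] + lines) + "\n"


# Example: wrap a 160-base sequence into lines of at most 60 characters.
text = format_submission("Name of your sequence", "acgt" * 40)
```

   The resulting text can be written to a file such as test.seq and mailed
   as shown in the example command above.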

Hexon output:		
******************
   1st line - name of your sequence
   2nd line - length of your sequence
   3rd line - number of potential exons
   4th line and next - positions and scores of predicted exons
   Note that some predicted exons may only partially coincide with
   real exons; the higher the score of an exon, the more probable it
   is that the prediction is a precise authentic exon.
   For example:

   HUMALPHA     4556 bp ds-DNA             PRI       15-SEP-1 
   length of sequence -   4556
   number of potential exon:  10
   380 -    516 w=  9.66
   611 -    727 w= 17.13
   839 -    954 w= 21.43
  1147 -   1321 w=  9.08
  1819 -   1953 w= 12.21
  2053 -   2125 w= 14.77
  2254 -   2388 w= 12.91
  2470 -   2661 w= 10.12
  2881 -   2997 w= 17.17
  3120 -   3562 w=  5.87
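
   A reply in this layout can be read back into a program. The Python
   sketch below assumes the output always follows the example above (name
   line, length line, exon count line, then one "start - end w= score"
   line per exon). The function name and the returned dictionary keys are
   hypothetical.

```python
import re

# Matches lines such as "   380 -    516 w=  9.66".
EXON_LINE = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s+w=\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_hexon_output(text):
    """Parse HEXON output text into a dictionary."""
    lines = [line for line in text.splitlines() if line.strip()]
    exons = []
    for line in lines[3:]:
        match = EXON_LINE.match(line)
        if match:
            start, end, score = match.groups()
            exons.append((int(start), int(end), float(score)))
    return {
        "name": lines[0].strip(),
        "length": int(lines[1].split("-")[-1]),
        "count": int(lines[2].split(":")[-1]),
        "exons": exons,
    }


example = """\
   HUMALPHA     4556 bp ds-DNA             PRI       15-SEP-1
   length of sequence -   4556
   number of potential exon:  10
   380 -    516 w=  9.66
   611 -    727 w= 17.13
   839 -    954 w= 21.43
  1147 -   1321 w=  9.08
  1819 -   1953 w= 12.21
  2053 -   2125 w= 14.77
  2254 -   2388 w= 12.91
  2470 -   2661 w= 10.12
  2881 -   2997 w= 17.17
  3120 -   3562 w=  5.87
"""

result = parse_hexon_output(example)
# Rank predictions from most to least confident by score.
ranked = sorted(result["exons"], key=lambda exon: exon[2], reverse=True)
```

   Sorting by score puts the predictions most likely to be precise
   authentic exons first.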

References:
  The method is described in detail in:

   Solovyev V.V., Salamov A.A., Lawrence C.B.
   Predicting internal exons by oligonucleotide composition and
   discriminant analysis of spliceable open reading frames.
   Nucl. Acids Res. (1994, in press).

   Solovyev V.V., Salamov A.A., Lawrence C.B.
   The prediction of human exons by oligonucleotide composition and
   discriminant analysis of spliceable open reading frames.
   In: The Second International Conference on Intelligent Systems
   for Molecular Biology (eds. Altman R., Brutlag D.,
   Karp R., Lathrop R. and Searls D.), AAAI Press, Menlo Park, CA
   (1994, in press).

Problems, comments, and suggestions:
   can be mailed to solovyev at cmb.bcm.tmc.edu.
   




