The Baylor College Of Medicine Computational Biology Group
Houston, TX
announces a new service
***************************************************************************
*********** NOTE ADDRESSES AND FORMATS HAVE CHANGED!! *********************
***************************************************************************
FEXH
(find exons)
Prediction of internal, 5'- and 3'- exons in Human DNA sequences
NOTE: This service is temporarily being provided through the
University of Houston Gene-Server. Only two jobs will be run at a
time.
Analysis of uncharacterized human sequences is available by sending a
file containing a sequence name as the first line and a sequence (with
no more than 80 characters/line) to
service@bchs.uh.edu
with the subject line "FEXH".
Example: mail -s FEXH service@bchs.uh.edu < test.seq
where test.seq is a file containing the sequence.
Method description:
**********************
The algorithm first predicts all internal exons in a given sequence
using a linear discriminant function. This function combines
characteristics describing the donor and acceptor splice sites, the
5'- and 3'-intron regions, and the coding region for each open reading
frame flanked by GT and AG dinucleotides. Potential 5'- and 3'-exons
are then predicted by corresponding discriminant functions to the left
of the first internal exon and to the right of the last internal exon,
respectively.
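The candidate-generation step described above can be sketched in Python.
This is only an illustration, not the Fexh implementation: it assumes that
a candidate internal exon is a segment preceded by AG and followed by GT
with at least one reading frame free of stop codons, and the names
candidate_internal_exons, has_open_frame, linear_discriminant_score and
the min_len cutoff are hypothetical. The actual characteristics and
weights used by Fexh are not given in this announcement.

```python
STOP_CODONS = {"taa", "tag", "tga"}


def has_open_frame(segment):
    """Return True if at least one of the three frames has no stop codon."""
    for frame in range(3):
        codons = (segment[i:i + 3] for i in range(frame, len(segment) - 2, 3))
        if not any(codon in STOP_CODONS for codon in codons):
            return True
    return False


def candidate_internal_exons(seq, min_len=30):
    """List (start, end) candidates, 1-based inclusive.

    Assumption: a candidate starts right after an AG (acceptor side)
    and ends right before a GT (donor side).
    """
    seq = seq.lower()
    # 0-based index of the first exon base after each AG.
    starts = [i + 2 for i in range(len(seq) - 1) if seq[i:i + 2] == "ag"]
    # 0-based index of each GT, i.e. the exclusive end of the exon.
    ends = [i for i in range(len(seq) - 1) if seq[i:i + 2] == "gt"]
    candidates = []
    for start in starts:
        for end in ends:
            if end - start >= min_len and has_open_frame(seq[start:end]):
                candidates.append((start + 1, end))
    return candidates


def linear_discriminant_score(features, weights, bias=0.0):
    """Generic linear score: bias plus the weighted sum of the features.

    The feature values and weights are placeholders supplied by the caller.
    """
    return bias + sum(w * f for w, f in zip(weights, features))
```

Each candidate would then be described by a feature vector (splice site,
intron and coding characteristics) and ranked by a score of this linear
form.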
Accuracy:
***************
The accuracy of exon recognition has been estimated on a set
of 1016 exons from 181 complete genes.
The set contains nucleotide sequences spanning from 150 bp before the
first coding region to 150 bp after the last coding region.
Test:                      Fexh          Grail-2
Exact exon prediction      70%           40%
Exon nucleotides           85%(0.84)     77%(0.76)
The numbers in () are the correlation coefficients.
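For readers who want to compute comparable figures on their own test set,
here is a minimal Python sketch. The announcement does not define the
measures, so this assumes that "exact exon prediction" is the fraction of
true exons whose two boundaries are both predicted exactly, that the
nucleotide percentage is the fraction of true exon nucleotides covered by
predictions, and that the correlation coefficient is the usual Matthews
correlation over nucleotides. The function names are hypothetical.

```python
import math


def exon_positions(exons):
    """Expand (start, end) pairs, 1-based inclusive, into a set of positions."""
    return {pos for start, end in exons for pos in range(start, end + 1)}


def exact_exon_fraction(predicted, actual):
    """Fraction of true exons predicted with both boundaries exact."""
    actual_set = set(actual)
    if not actual_set:
        return 0.0
    return len(actual_set & set(predicted)) / len(actual_set)


def nucleotide_accuracy(predicted, actual, seq_len):
    """Return (sensitivity, correlation coefficient) at the nucleotide level."""
    pred = exon_positions(predicted)
    act = exon_positions(actual)
    tp = len(pred & act)          # exon nucleotides predicted as exon
    fp = len(pred - act)          # non-exon nucleotides predicted as exon
    fn = len(act - pred)          # exon nucleotides that were missed
    tn = seq_len - tp - fp - fn   # non-exon nucleotides left unpredicted
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    cc = (tp * tn - fp * fn) / denom if denom else 0.0
    return sensitivity, cc
```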
Note that this program does not assemble the predicted exons into a
gene model, so it is more reliable when an exon is missing (for example
due to sequence errors). For gene model prediction you can use the
"fgeneh" program from the Gene-Server (it is more accurate for
complete gene structure prediction); if you have only an internal part
of a gene sequence, internal exons may be predicted by the server's
"hexon" program.
Submitting sequences via email:
********************************
For email submission the sequences must have the following format:
Name of the sequence
ccatctctgtcttgcaggacaatgccgtcttctgtctcgtggggcatcctcctgctggca
ggcctgtgctgcctggtccctgtctccctggctgaggatccccagggagatgctgcccag
aagacagatacatcccaccatgatcaggatcacccaaccttcaacaagatcacccccaac
ctggctgagttcgccttcagcctataccgccagctggcacaccagtccaacagcaccaat
atcttcttctccccagtgagcatcg...............
(Restrict the line length to 80 characters or fewer.)
Send the file containing the sequence to:
service@bchs.uh.edu
The subject line must be:
Subject: fexh
Example: mail -s fexh service@bchs.uh.edu < test.seq
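A file in the format above can be prepared with a few lines of Python.
This is a minimal sketch: the function name format_submission and the
default width of 60 characters are illustrative choices, not part of the
service, and stripping non-letter characters is an assumption about what
a pasted sequence may contain.

```python
def format_submission(name, sequence, width=60):
    """Return the file text: the name line, then the wrapped sequence."""
    if not 1 <= width <= 80:
        raise ValueError("width must be between 1 and 80")
    # Drop spaces, digits and line breaks that often come with pasted sequences.
    cleaned = "".join(ch for ch in sequence if ch.isalpha()).lower()
    if not cleaned:
        raise ValueError("sequence is empty")
    lines = [cleaned[i:i + width] for i in range(0, len(cleaned), width)]
    return "\n".join([name.strip()] + lines) + "\n"
```

Writing the returned text to test.seq gives a file that can be sent with
the mail command shown above.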
Fexh output:
*******************************
1st line - name of your sequence
2nd line - length of your sequence
3rd line - number of potential exons
4th and following lines - positions of predicted exons and their weights
For example:
HUMALPHA 4556 bp ds-DNA PRI 15-SEP-1
length of sequence - 4556
number of potential exon: 10
380 - 516 w= 9.10
611 - 727 w=11.10
839 - 954 w=12.33
1147 - 1321 w= 7.70
1819 - 1953 w= 7.90
2053 - 2125 w=12.51
2254 - 2388 w= 6.66
2470 - 2661 w=10.11
2881 - 2997 w= 8.87
3120 - 3562 w= 9.92
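A reply in the layout shown above can be read into a program with a
short parser. The following Python sketch is based only on the example
in this announcement, so the regular expressions are assumptions about
the format, and the name parse_fexh_output is hypothetical.

```python
import re

# Matches lines such as "380 - 516 w= 9.10", with flexible spacing around "=".
EXON_LINE = re.compile(r"^\s*(\d+)\s*-\s*(\d+)\s+w\s*=\s*([0-9.]+)\s*$")
LENGTH_LINE = re.compile(r"length of sequence\s*-\s*(\d+)", re.IGNORECASE)


def parse_fexh_output(text):
    """Return (name, length, exons); exons is a list of (start, end, weight)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        raise ValueError("empty output")
    name = lines[0].strip()   # 1st line: name of the sequence
    length = None
    exons = []
    for ln in lines[1:]:
        exon = EXON_LINE.match(ln)
        if exon:
            start, end, weight = exon.groups()
            exons.append((int(start), int(end), float(weight)))
            continue
        found = LENGTH_LINE.search(ln)
        if found:
            length = int(found.group(1))
    return name, length, exons
```

The exon list can then be sorted by weight, for example with
sorted(exons, key=lambda e: e[2], reverse=True), to look at the
highest-scoring predictions first.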
Reference:
1. Solovyev V.V., Salamov A.A., Lawrence C.B.
Predicting internal exons by oligonucleotide composition and
discriminant analysis of spliceable open reading frames.
Nucl. Acids Res., 1994 (in press).
2. Solovyev V.V., Salamov A.A., Lawrence C.B.
The prediction of human exons by oligonucleotide composition and
discriminant analysis of spliceable open reading frames.
In: The Second International Conference on Intelligent Systems
for Molecular Biology (eds. Altman R., Brutlag D., Karp R., Lathrop R.
and Searls D.), AAAI Press, Menlo Park, CA, 1994 (in press).
Problems, comments, and suggestions
can be mailed to solovyev@cmb.bcm.tmc.edu.