-----------------------------------------------------------------
Gene-Finder with protein database analysis, V.1.
(Solovyev V.V., Salamov A.A.)
------------------------------------------------------------------
Three programs for human and bacterial gene prediction are available by ftp
from gc.bcm.tmc.edu. The programs run best on DEC Alpha and are available in the
dec_... directories. Versions for Sun OS 5 and Sun OS 4 will be in the sun5_.. and
sun4_ directories.
To download: ftp gc.bcm.tmc.edu
cd pub
and then cd dec_fgenehb or dec_fexhb or dec_cdsb
Copy the files from these directories into separate directories of your own.
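The download steps above could also be scripted. The following is only a sketch
using Python's standard ftplib module. It assumes that the server accepts
anonymous login and that each directory holds plain files only; neither is
stated in this document. The names fetch_program and remote_dir are
hypothetical helpers, not part of Gene-Finder.

```python
from ftplib import FTP
from pathlib import Path

HOST = "gc.bcm.tmc.edu"

# DEC Alpha directories named in this README (all under pub).
DEC_DIRS = {"fgenehb": "dec_fgenehb", "fexhb": "dec_fexhb", "cdsb": "dec_cdsb"}


def remote_dir(program):
    """Return the server directory that holds the given program."""
    return "pub/" + DEC_DIRS[program]


def fetch_program(program, local_root="."):
    """Copy every file of one program into its own local directory."""
    target = Path(local_root) / program
    target.mkdir(parents=True, exist_ok=True)
    with FTP(HOST) as ftp:
        ftp.login()  # assumption: anonymous login is accepted
        ftp.cwd(remote_dir(program))
        for name in ftp.nlst():  # assumption: the listing contains files only
            with open(target / name, "wb") as out:
                # retrbinary transfers in binary mode, as the README requires
                ftp.retrbinary("RETR " + name, out.write)
    return target
```

For example, fetch_program("fgenehb") would place the files in ./fgenehb.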
Additional information about the Gene-Finder programs is available at:
WWW: http://dot.imgen.bcm.tmc.edu:9331/gene-finder/gf.html
*************************************************************************
FGENEHB - Prediction of gene structure with possible similarity analysis
================================================================
and FEXHB (find exons) - Prediction of internal, 5'- and 3'- exons
===================================================================
in Human/mammalian DNA sequences
These programs are similar to fgeneh and fexh, but they also analyse the similarity of
potential exons to known proteins in a protein database, using the FASTA program (William Pearson).
(Use of the FASTA program is subject to the FASTA copyright.)
The accuracy of FGENEHB and FEXHB is about 10% higher than that of FGENEH and FEXH for the
recognition of weak exons. The programs are time-consuming, so the current version is limited
to analysing at most 50000 bases of your sequence on DEC and 20000 bp on SUN.
..........................................................................
CDSB - Prediction of bacterial genes (sequence length < 400000 bases)
========================================================================
To run any program locally:
fgenehb test.seq test.res or
fexhb test.seq test.res or
cdsb test.seq test.res
where test.seq is the file with your sequence and test.res is the file with
prediction results.
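If the programs are to be called from a script, the command lines above can be
wrapped as in the sketch below. It assumes that the binaries are on your PATH
under the names shown above. The helpers build_command and run_prediction are
hypothetical, and the meaning of the exit status is not documented here, so it
is only returned, not interpreted.

```python
import subprocess

PROGRAMS = ("fgenehb", "fexhb", "cdsb")


def build_command(program, seq_file, res_file):
    """Build the argument list: program, sequence file, result file."""
    if program not in PROGRAMS:
        raise ValueError(f"unknown program: {program}")
    return [program, seq_file, res_file]


def run_prediction(program, seq_file, res_file):
    """Run one program and return its exit status (meaning undocumented)."""
    completed = subprocess.run(build_command(program, seq_file, res_file))
    return completed.returncode
```

For example, run_prediction("fexhb", "test.seq", "test.res") corresponds to the
second command line above.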
Sequence format in the file:
Name of your sequence
ccatctctgtcttgcaggacaatgccgtcttctgtctcgtggggcatcctcctgctggca
ggcctgtgctgcctggtccctgtctccctggctgaggatccccagggagatgctgcccag
aagacagatacatcccaccatgatcaggatcacccaaccttcaacaagatcacccccaac
ctggctgagttcgccttcagcctataccgccagctggcacaccagtccaacagcaccaat
atcttcttctccccagtgagcatcg...............
(Each line must be shorter than 80 characters.)
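A sequence held in some other form can be written out in this layout with a
few lines of code. The sketch below puts the name on the first line and wraps
the sequence at 60 letters per line, as in the example above. It also checks
the length limits quoted in this README. The function names and the platform
labels "dec" and "sun" are hypothetical, and whether the bounds are inclusive
is an assumption.

```python
def format_sequence(name, sequence, width=60):
    """Return the file text: a name line, then wrapped sequence lines."""
    if not 0 < width < 80:
        raise ValueError("line length must be less than 80 characters")
    if len(name) >= 80:
        raise ValueError("name line must be less than 80 characters")
    seq = "".join(sequence.split())  # drop spaces and line breaks
    lines = [name]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"


def within_limit(program, platform, n_bases):
    """Check a sequence length against the limits stated in this README."""
    if program == "cdsb":
        return n_bases < 400000  # CDSB: sequence length < 400000 bases
    # fgenehb and fexhb: up to 50000 bases on DEC, 20000 bp on SUN
    limit = {"dec": 50000, "sun": 20000}[platform]
    return n_bases <= limit  # assumption: the limit itself is allowed
```

For example, format_sequence("Name of your sequence", seq) gives text that
can be saved as test.seq.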
------------------------------------------------------------------
TO FTP:
You must copy all files (in binary mode) from the following directories:
fexhb: from dec_fexhb
fgenehb: from dec_fgenehb
cdsb: from dec_cdsb
In addition, for fexhb and fgenehb you must:
1) copy a protein database in FASTA format;
you can copy the file owl.seq from /public/owl
2) change the first line of the file fastgbs so that it gives the
path to this file on your computer (change the path part of the first line):
OWL non-redundant protein DB$0A/genome1/ftp/public/owl/owl.seq 5
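The change to the first line of fastgbs can be made in any text editor. As a
sketch only, it could also be scripted as below. The code assumes the layout
seen in the line above: a description, then "$" and two characters, then the
path, then a space and a final field. This layout is inferred from that single
example, and set_library_path and update_fastgbs are hypothetical helpers.

```python
def set_library_path(line, new_path):
    """Replace the path in one fastgbs line, keeping everything else."""
    head, dollar, rest = line.rstrip("\n").partition("$")
    if not dollar or len(rest) < 3:
        raise ValueError("line does not match the assumed layout")
    flags, tail = rest[:2], rest[2:]  # e.g. "0A", then "path 5"
    old_path, space, trailer = tail.rpartition(" ")
    if not space:
        raise ValueError("no trailing field after the path")
    return f"{head}${flags}{new_path} {trailer}"


def update_fastgbs(filename, new_path):
    """Rewrite only the first line of the given file."""
    with open(filename) as handle:
        lines = handle.read().splitlines()
    lines[0] = set_library_path(lines[0], new_path)
    with open(filename, "w") as handle:
        handle.write("\n".join(lines) + "\n")
```

For example, update_fastgbs("fastgbs", "/data/owl/owl.seq") would point the
first entry at a copy of owl.seq kept in /data/owl (an invented location).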
-----------------------
Send questions through: http://dot.imgen.bcm.tmc.edu:9331/gene-finder/gf.html
or to solovyev at cmb.bcm.tmc.edu