FTP Gene-Finder with Protein Data Base Search

Victor V. Solovyev solovyev at cmb.bcm.tmc.edu
Sun Apr 28 17:31:49 EST 1996

	Gene-Finder with protein data base analysis V.1.

			(Solovyev V.V., Salamov A.A.)
Three programs for human and bacterial gene prediction are available by ftp
from gc.bcm.tmc.edu. The programs run best on DEC Alpha and are available from
the dec_... directories. Variants for Sun OS 5 and OS 4 will be in the sun5_..
and sun4_ directories.

To get them: ftp gc.bcm.tmc.edu
             cd pub
and then cd dec_fgenehb, dec_fexhb or dec_cdsb.

Copy the files from each of these directories into a separate directory of your own.

Additional information about the Gene-Finder programs is available from:

WWW:      http://dot.imgen.bcm.tmc.edu:9331/gene-finder/gf.html


FGENEHB - prediction of gene structure with possible similarity analysis
and FEXHB (find exons) - prediction of internal, 5'- and 3'- exons
in Human/mammalian DNA sequences

These programs are similar to fgeneh and fexh, but they also analyse the similarity of
potential exons to known proteins in a protein data base using the FASTA program
(William Pearson); use of the FASTA program is subject to the FASTA copyright.
The accuracy of FGENEHB and FEXHB is about 10% higher than that of FGENEH and FEXH for
recognition of weak exons. The programs are time consuming, so the current version is
limited to analysing up to 50000 bases of your sequence on DEC and 20000 bp on SUN.

CDSB - prediction of bacterial genes (sequence length < 400000 bases)
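
The length limits quoted above can be checked before a sequence is submitted.
The following is only a sketch: the function name, the table layout and the
"dec"/"sun" platform labels are illustrative choices, and it assumes that the
CDSB limit is the same on every platform, which the announcement does not state.

```python
# Limits taken from the text: "up to 50000 bases" on DEC, 20000 bp on SUN
# for fgenehb/fexhb, and "< 400000 bases" for cdsb.
# The table layout and the platform labels are illustrative assumptions.
MAX_BASES = {
    ("fgenehb", "dec"): 50000,
    ("fexhb", "dec"): 50000,
    ("fgenehb", "sun"): 20000,
    ("fexhb", "sun"): 20000,
}
CDSB_LIMIT = 400000  # assumed to apply on every platform


def within_limit(program, platform, n_bases):
    """Return True if a sequence of n_bases fits the quoted limit."""
    if program == "cdsb":
        # The text gives a strict bound for cdsb.
        return n_bases < CDSB_LIMIT
    # "up to" is read here as an inclusive bound.
    return n_bases <= MAX_BASES[(program, platform)]
```

For example, within_limit("fexhb", "sun", 25000) is False, so such a sequence
would have to be split or run on DEC.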

To run any of the programs locally:

fgenehb test.seq test.res or 
fexhb test.seq test.res   or 
cdsb test.seq test.res

where test.seq is the file with your sequence and test.res is the file with 
prediction results.
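
If the programs are to be called from a script, the command lines above can be
wrapped as follows. This is a minimal sketch: it assumes the binaries are on
your PATH, and the helper names build_command and run_prediction are made up
for illustration; they are not part of the Gene-Finder distribution.

```python
import subprocess

# Program names as given in the announcement.
PROGRAMS = ("fgenehb", "fexhb", "cdsb")


def build_command(program, seq_file, res_file):
    """Build the argument list: <program> <sequence file> <result file>."""
    if program not in PROGRAMS:
        raise ValueError("unknown program: %s" % program)
    return [program, seq_file, res_file]


def run_prediction(program, seq_file, res_file):
    """Run one program; assumes the binary can be found on PATH."""
    # check=True raises CalledProcessError on a non-zero exit status.
    return subprocess.run(build_command(program, seq_file, res_file), check=True)
```

A call such as run_prediction("fexhb", "test.seq", "test.res") then corresponds
to typing "fexhb test.seq test.res" at the prompt.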

Sequence format in the file: 

Name of your sequence 

(Each line must be less than 80 characters long).
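
An input file in this layout can be produced with a few lines of Python. The
sketch below assumes that the name goes on the first line and the sequence
follows on the next lines; the function name and the default width of 60 are
illustrative choices, the only requirement taken from the text being that
every line stays under 80 characters.

```python
def format_sequence_file(name, sequence, width=60):
    """Return file text: a name line, then the sequence in short lines."""
    if not 0 < width < 80:
        raise ValueError("width must be between 1 and 79")
    if len(name) >= 80:
        raise ValueError("name line must be shorter than 80 characters")
    # Drop any spaces or line breaks already present in the sequence.
    seq = "".join(sequence.split())
    lines = [name]
    lines += [seq[i:i + width] for i in range(0, len(seq), width)]
    return "\n".join(lines) + "\n"
```

Writing the returned text to test.seq gives a file that can be passed to the
programs as their first argument.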



You have to copy all files (binary) from the following directories:

fexhb:    from dec_fexhb

fgenehb:  from dec_fgenehb

cdsb:     from dec_cdsb

For fexhb and fgenehb you also have to:

1) copy a protein data base in FASTA format;
you can copy the file owl.seq from /public/owl

2) change the first line of the file fastgbs so that it gives the path
to this file on your computer (change the underlined part of the first line):

OWL non-redundant protein DB$0A/genome1/ftp/public/owl/owl.seq 5
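
The change to this line can also be scripted. The sketch below assumes that the
line has the layout "description", "$", two characters, the path, a space and a
final number, and that only the path needs to change; this is a reading of the
example line above, not a documented specification. The function name and the
path /data/owl/owl.seq are hypothetical.

```python
def set_library_path(first_line, new_path):
    """Replace the path in a line like 'text$0A/old/path 5'."""
    line = first_line.rstrip("\n")
    head, sep, rest = line.partition("$")
    if not sep or len(rest) < 3:
        raise ValueError("no '$' field found in line")
    # Assumed layout: two characters after '$', then the path.
    flags, tail = rest[:2], rest[2:]
    old_path, space, number = tail.rpartition(" ")
    if not space:
        raise ValueError("no trailing number found in line")
    return "%s$%s%s %s" % (head, flags, new_path, number)


# Example with a hypothetical local path:
line = "OWL non-redundant protein DB$0A/genome1/ftp/public/owl/owl.seq 5"
print(set_library_path(line, "/data/owl/owl.seq"))
```

Keep a copy of the original fastgbs before overwriting its first line.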

Send questions through: http://dot.imgen.bcm.tmc.edu:9331/gene-finder/gf.html

or to solovyev at cmb.bcm.tmc.edu
