looking for restriction enzyme software

Brian Fristensky frist at cc.umanitoba.ca
Tue Aug 16 11:18:22 EST 1994

In article 5648 at cu23.crl.aecl.ca, doerffer at cu33.crl.aecl.ca (doerffer) writes:
> I am looking for the piece of public domain software to look for restriction enzyme sites in a nucleotide sequence. Could anybody help me in finding it.  If it does
> not exist as public domain, could anybody prompt me where I can buy it.  I am
> working on SunSparc 10 running SunOS 4.1.3 with OpenWindows 3.
> Thanks
> Kasia Doerffer
> ================================================================================
> Katarzyna (Kasia) Doerffer 
> Radiation Biology and
> Health Physics Branch                 Internet: doerffer at cu33.crl.aecl.ca
> AECL Research                         Phone: (613) 584-3311, x. 4031
> Chalk River Laboratories              FAX: (613) 584-1713
> Chalk River, Ontario
> Mail station:  51
> ================================================================================

Look at BACHREST and INTREST, which are part of the FSAP package. BACHREST
reads a file of restriction enzyme recognition sequences and searches for sites.
INTREST is the interactive version, in which the user types in names and
recognition sequences. For example, the output from a search of Bluescript
KSm13+ (ARBLKSP) looks like this:

BACHREST   Version  5/15/93
ARBLKSP  Configuration:  CIRCULAR  Length:       2958 bp
                                         # of
                                Cut     Sites Sites   Frags   Begin     End

{bunch of lines deleted so that we can look at some good examples....}

Esp3I     CGTCTC(1/5)                       0

FseI      GGCCGG^CC                         0

GsuI      CTGGAG(16/14)                     1
                                               2117    2958    2117    2116

HaeII     RGCGC^Y                           4
                                                382    1938    1402     381
                                                390     642     390    1031
                                               1032     370    1032    1401
                                               1402       8     382     389

HgiAI     GWGCW^C                           4
                                                658    1161    1472    2632
                                               1472     898    2718     657
                                               2633     814     658    1471
                                               2718      85    2633    2717

HindII    GTY^RAC                           1
                                                737    2958     737     736

HindIII   A^AGCTT                           1
                                                720    2958     720     719

Output is arranged in columns: the enzyme name, the recognition sequence with
its cutting sites, the number of sites found, and their locations.
The last three columns give the fragment sizes, from largest to
smallest, as they would appear on a gel, and the 5' and 3' ends of each
fragment.
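
The relationship between the Sites column and the Frags/Begin/End columns
can be sketched in a few lines of Python. This is only an illustration of
the arithmetic for a circular molecule, not BACHREST's actual code; the
function name circular_fragments is made up for the example, and it assumes
that each value in the Sites column is the first base of a fragment.

```python
def circular_fragments(sites, length):
    """Return (size, begin, end) tuples for a circular molecule.

    sites  -- 1-based positions at which fragments begin (assumption:
              this is what the Sites column reports)
    length -- total length of the circular sequence in bp
    """
    if not sites:
        return []  # no sites, so the circle is never opened
    sites = sorted(sites)
    frags = []
    for i, begin in enumerate(sites):
        # The next site, wrapping around the origin after the last one.
        nxt = sites[(i + 1) % len(sites)]
        # A fragment ends one base before the next site begins.
        end = nxt - 1 if nxt > 1 else length
        # Modular distance; a single site yields the full-length molecule.
        size = (nxt - begin) % length or length
        frags.append((size, begin, end))
    # Largest fragment first, as on a gel.
    return sorted(frags, reverse=True)


# HaeII sites in ARBLKSP (2958 bp, circular), taken from the output above.
for size, begin, end in circular_fragments([382, 390, 1032, 1402], 2958):
    print(f"{size:8d}{begin:8d}{end:8d}")
```

With the HaeII sites this gives fragments of 1938, 642, 370 and 8 bp, with
the same Begin and End values as in the listing.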

BACHREST can directly read Rich Roberts's REBASE files. (REBASE is a database of
restriction enzymes, recognition sites, isoschizomers, commercial
availability, etc.) Output from BACHREST can be read by DIGEST, which
calculates the fragments generated by digestion with two or more enzymes.
Either complete or partial digests can be calculated.
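
As a rough illustration of what a complete double digest involves (this is
not DIGEST's implementation, and the function name double_digest is
hypothetical), the site lists of the two enzymes can be pooled and the
fragment sizes read off between neighbouring sites. The sketch assumes a
circular molecule and that the reported site positions are the points at
which fragments begin.

```python
def double_digest(sites_a, sites_b, length):
    """Fragment sizes, largest first, for a complete digest of a circular
    molecule with two enzymes.

    sites_a, sites_b -- 1-based site positions for each enzyme
    length           -- length of the circular sequence in bp
    """
    # Pool the sites; a position shared by both enzymes counts once.
    cuts = sorted(set(sites_a) | set(sites_b))
    if not cuts:
        return []  # neither enzyme cuts
    sizes = []
    for i, cut in enumerate(cuts):
        nxt = cuts[(i + 1) % len(cuts)]  # wrap around the origin
        sizes.append((nxt - cut) % length or length)
    return sorted(sizes, reverse=True)


# HaeII and HindIII sites in ARBLKSP, taken from the BACHREST output above.
haeII = [382, 390, 1032, 1402]
hindIII = [720]
print(double_digest(haeII, hindIII, 2958))
```

The single HindIII site at 720 falls inside the 642 bp HaeII fragment, so
that fragment is replaced by pieces of 330 and 312 bp while the others are
unchanged. A partial digest would also have to include fragments spanning
non-adjacent sites, which this sketch does not attempt.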

One word of caution regarding circular sequences. The majority of plasmids
and other circular sequences in GenBank are not annotated as being
circular. To correctly generate digests of circular molecules, you 
have to make sure that there is a 'C' in column 43 of the first line (LOCUS
line) of an entry. Also, when you do find a circular sequence that 
has not been annotated as such, PLEASE report it to NCBI, along with
LOCUS name and ACCESSION number, to update at ncbi.nlm.nih.gov.
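
A quick way to check an entry before running a digest is to test column 43
of its LOCUS line. The sketch below is a minimal example: the function names
are hypothetical, the comparison is made case-insensitive as a precaution,
and the sample LOCUS line is constructed for the example rather than copied
from GenBank. Only the column-43 rule comes from the text above; the rest of
the line layout is an assumption.

```python
def is_marked_circular(locus_line):
    """True if column 43 (1-based) of a LOCUS line holds a 'C'."""
    # Column 43 is index 42 in Python's 0-based indexing.
    return len(locus_line) > 42 and locus_line[42].upper() == "C"


def make_locus_line(name, length, circular):
    """Build an illustrative LOCUS line (layout assumed, not official)."""
    topology = "Circular" if circular else ""
    # Columns 1-12 keyword, 13-22 name, 23-29 length, then ' bp ',
    # strand, molecule type, two spaces, and topology from column 43.
    return f"{'LOCUS':<12}{name:<10}{length:>7} bp {'ds-':<3}{'DNA':<4}  {topology:<10}"


line = make_locus_line("ARBLKSP", 2958, circular=True)
print(line)
print(is_marked_circular(line))
```

If the check fails for a sequence known to be circular, placing a 'C' in
column 43 of a local copy lets the digest be calculated correctly.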

The FSAP distribution also contains the files necessary
for running BACHREST under Steven Smith's Genetic Data Environment (GDE).
If you are doing sequence analysis on a SPARCstation, you should
definitely have GDE. Note that several other FSAP programs can now be run
under GDE:

   NUMSEQ - sequence numbering, translation etc.
   DxHOM,PxHOM - programs for 2D-matrix similarity comparison

The FSAP programs can be obtained by anonymous FTP to the 'psgendb' directory
at ftp.cc.umanitoba.ca

Brian Fristensky                | 
Department of Plant Science     |  A question is like a knife that slices
University of Manitoba          |  through the stage backdrop and gives us
Winnipeg, MB R3T 2N2  CANADA    |  a look at what lies hidden behind.
frist at cc.umanitoba.ca           |  
Office phone:   204-474-6085    |  Milan Kundera, THE UNBEARABLE LIGHTNESS 
FAX:            204-261-5732    |  OF BEING

More information about the Bio-soft mailing list
