John Watson <watson_j at bms.com> writes:
>Does anyone know of a utility or script that will read a genbank-type
>file and automatically extract the coding regions as listed in the
>comments?
I have a Unix script, called gb2pep, that creates a peptide sequence
file from the peptide sequence included in the comments of a genbank
file that's been converted to GCG format with 'fromgenbank'. It was
written for us by Scott Rose, then at GCG, using gawk; you have to
run 'reformat' on the results before using them.
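For readers without gawk or GCG at hand, the general idea can be sketched
in Python. This is not the gb2pep script, only a rough illustration under
some assumptions: the input is a plain GenBank flat file (not the GCG-format
output of 'fromgenbank'), the peptide sequences appear as /translation="..."
qualifiers, and the function names extract_translations and to_fasta are
made up for this example.

```python
import re

# Matches a /translation="..." qualifier; the quoted value may wrap
# across several lines of the GenBank feature table.
TRANSLATION_RE = re.compile(r'/translation="([^"]*)"')


def extract_translations(genbank_text):
    """Return the peptide strings found in /translation qualifiers."""
    peptides = []
    for match in TRANSLATION_RE.finditer(genbank_text):
        # Drop the line breaks and indentation that wrap long sequences.
        peptides.append(re.sub(r"\s+", "", match.group(1)))
    return peptides


def to_fasta(peptides, prefix="peptide", width=60):
    """Format peptides as FASTA-style text (hypothetical naming scheme)."""
    lines = []
    for index, peptide in enumerate(peptides, start=1):
        lines.append(f">{prefix}_{index}")
        # Wrap each sequence at `width` residues per line.
        for start in range(0, len(peptide), width):
            lines.append(peptide[start:start + width])
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py entry.gb > entry.pep
    with open(sys.argv[1]) as handle:
        sys.stdout.write(to_fasta(extract_translations(handle.read())))
```

The sketch takes the translations exactly as they are written in the entry.
It does not translate the DNA itself, so an entry without /translation
qualifiers produces no output.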
Mike Hogan told me that we can redistribute the script. He says that
you may prefer the GCG program 'lookup', which can extract the coding
regions of the DNA sequence, ready for translation. I like gb2pep
because it's so simple and obvious to use. The script is available on
the web from
http://mbcr.bcm.tmc.edu/GCG/gb2pep.html
We're not providing any support for it, of course, but in our experience
it hasn't needed any.
________________________________________________________________________
Paula E. Burch, Ph.D. Molecular Biology Computational Resource
Baylor College of Medicine phone: (713)798-6023 fax: (713)798-4279
Houston, Texas 77030 internet: pburch at bcm.tmc.edu
http://mbcr.bcm.tmc.edu/pburch.html
www at mbcr.bcm.tmc.edu