Alexander Kanapin wrote:
> I am looking for software (source code) for simple parsing of GenBank
> CDS field. Does anybody know about such tools? I need some kind of
> automatic splicing - to read CDS data and produce new sequence string
> according to CDS coordinates of coding regions from entry sequence.
The XYLEM package is specifically designed for creating and
managing sequence database subsets. It includes a program
called FEATURES that, given a set of GenBank Accession
numbers or LOCUS names, extracts all features corresponding
to one or more feature keys (e.g., CDS, mRNA, exon).
ANY legal feature can be parsed to yield a sequence. Even
when features are scattered across SEGMENTED GenBank
entries (e.g., the exons of a large gene were sequenced, but
not the introns), FEATURES will properly reassemble them
according to the join() statements in the Feature Table
of the entry. For example, given the feature
/product="green visual pigment"
sequences from six different entries would be joined to
recreate the CDS.
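To make the idea of splicing by join() concrete, here is a minimal
Python sketch. It is not the FEATURES source code, only an illustration
of the technique. It assumes the sequences are already in memory as a
dict keyed by accession, and it handles only join(), complement(),
plain ranges, and ranges prefixed with another entry's accession. The
names splice, split_top_level, SEG1 and SEG2 are all made up for the
example.

```python
import re

# Complement table for unambiguous DNA bases only (an assumption of this sketch).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

# Optional "ACCESSION:" prefix, then "start..end" or a single position.
# The "<" and ">" partial-end markers are accepted and ignored.
_RANGE = re.compile(r"(?:([A-Za-z0-9_.]+):)?<?(\d+)(?:\.\.>?(\d+))?")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def splice(location, entries, accession):
    """Build the sequence described by a feature location.

    entries   -- dict mapping accession -> sequence string (hypothetical layout)
    accession -- entry that un-prefixed coordinates refer to
    """
    location = location.strip()
    if location.startswith("join(") and location.endswith(")"):
        inner = location[len("join("):-1]
        return "".join(splice(part, entries, accession)
                       for part in split_top_level(inner))
    if location.startswith("complement(") and location.endswith(")"):
        inner = location[len("complement("):-1]
        return reverse_complement(splice(inner, entries, accession))
    match = _RANGE.fullmatch(location)
    if match is None:
        raise ValueError(f"unsupported location: {location!r}")
    remote, start, end = match.groups()
    sequence = entries[remote or accession]
    # Feature Table coordinates are 1-based and inclusive.
    return sequence[int(start) - 1:int(end or start)]


# Toy data: two made-up segments standing in for a segmented entry.
entries = {"SEG1": "ATGAAA", "SEG2": "CCCTAG"}
cds = splice("join(1..3,SEG2:4..6)", entries, "SEG1")
```

The recursion mirrors the nesting of the location itself, so a form
such as complement(join(...)) needs no special case.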
FEATURES also allows extraction of sequence fragments
using Feature Table expressions. For example, given the
/note="flaD (sin) homologue; putative"
Any of the following expressions would return the same sequence:
M60287:/note="flaD (sin) homologue; putative"
Expressions allow you to extract sequence fragments even
when they are not explicitly annotated, or to work around
errors in annotation.
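As a rough illustration of how an expression of the shape shown above
(accession, colon, then a /qualifier="value" pair) could be resolved,
here is a small Python sketch. It is not the parser used by FEATURES
and it covers only that one form. The feature_table layout, the
function names, and the toy entry TOY1 with its location are
assumptions invented for the example.

```python
import re

# ACCESSION:/qualifier="value" -- the only expression form handled here.
_EXPRESSION = re.compile(r'([^:\s]+):/(\w+)="(.*)"')


def parse_expression(expression):
    """Split an expression into (accession, qualifier, value)."""
    match = _EXPRESSION.fullmatch(expression.strip())
    if match is None:
        raise ValueError(f"unsupported expression: {expression!r}")
    return match.groups()


def find_locations(expression, feature_table):
    """Return the locations of all features matching the expression.

    feature_table -- dict mapping accession -> list of features, each a
                     dict with "location" and "qualifiers" keys
                     (a hypothetical in-memory layout).
    """
    accession, qualifier, value = parse_expression(expression)
    return [feature["location"]
            for feature in feature_table.get(accession, [])
            if feature["qualifiers"].get(qualifier) == value]


# Parsing the expression quoted in the text above.
parts = parse_expression('M60287:/note="flaD (sin) homologue; putative"')

# Toy lookup against a made-up entry; the location is not real data.
toy_table = {
    "TOY1": [
        {"location": "join(1..3,7..9)", "qualifiers": {"note": "example"}},
        {"location": "20..40", "qualifiers": {"product": "other"}},
    ]
}
locations = find_locations('TOY1:/note="example"', toy_table)
```

Each location found this way is an ordinary Feature Table location, so
it can then be spliced out of the entry sequence in the usual manner.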
To learn more about XYLEM, or download Solaris binaries or
source code, see:
Brian Fristensky | Let me get this straight:
Department of Plant Science |
University of Manitoba | A company that dominates the desktop,
Winnipeg, MB R3T 2N2 CANADA | and can afford to hire an army of the
frist at cc.umanitoba.ca | world's best programmers, markets
Office phone: 204-474-6085 | what is arguably the world's LEAST
FAX: 204-474-7528 | reliable operating system?
http://home.cc.umanitoba.ca/~frist/ | What's wrong with this picture?