gap4: entering template information

James Bonfield jkb at mrc-lmb.cam.ac.uk
Tue Jan 26 07:01:31 EST 1999

In article <36AC5881.89E8D9B8 at Genetik.Uni-Bielefeld.DE> Frank-Joerg Vorhoelter <Frank.Vorhoelter at Genetik.Uni-Bielefeld.DE> writes:
>I do not
>know how to tell gap4 which read origins from which template. When I
>enter reads through pregap, for every read a new template is created.

Gap4 simply uses whatever text is in the TN (template name) field of
the Experiment File. If there is none, then it assumes the reading
comes from a template of its own.
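
To make that rule concrete, here is a small Python sketch of the
grouping it implies. This is an illustration only, not Gap4's own code:
the function name, the dict input and the choice of the reading name as
the key for a reading without a TN are all assumptions made for the
example.

```python
from collections import defaultdict


def group_by_template(readings):
    """Group reading names by template.

    readings maps a reading name to the text of its TN line,
    or to None when the experiment file has no TN line.
    """
    templates = defaultdict(list)
    for name, tn in readings.items():
        # No TN: the reading is treated as a template of its own.
        # Using the reading name as the key is an assumption of this sketch.
        key = tn if tn else name
        templates[key].append(name)
    return dict(templates)
```

Two readings that share a TN end up in the same list, while a reading
without one gets a list to itself.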

To get the TN field into the experiment file you'll need to add some
reading name parsing rules to either pregap or pregap4 (pregap's
successor). See the online documentation for clues on how to do
this. Alternatively, if you prefer, just hack a perl or awk script to
take the ID line, do some munging on it, and append a TN line to each
experiment file. Gap4 doesn't care how the lines got there. While
you're at it you will also need to provide SI (Size of Insert) lines,
in the format of (for example) "SI   1400..2000", and PR (primer type)
lines. The primer type is used to distinguish forward and reverse
reads from one another: "PR   1" is universal forward primer and
"PR   2" is universal reverse (with 3 and 4 being custom
forward/reverse).
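
As a rough sketch of the script approach, here it is in Python rather
than perl or awk. The reading name convention ("<template>.f1" for
forward, "<template>.r1" for reverse), the function name and the default
insert size are assumptions for illustration; adjust the regular
expression to whatever naming scheme your reads actually use. The line
formats and PR codes are the ones described above.

```python
import re

# Hypothetical naming convention: "<template>.<f|r><number>", e.g. "xb54a3.f1".
NAME_RE = re.compile(r"^(?P<template>[^.]+)\.(?P<dir>[fr])\d*$")

# PR codes as described in the post: 1 = universal forward, 2 = universal reverse.
PRIMER_CODE = {"f": 1, "r": 2}


def add_template_lines(text, insert_min=1400, insert_max=2000):
    """Return experiment file text with TN, SI and PR lines appended."""
    reading = None
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2 and parts[0] == "ID":
            reading = parts[1].strip()
            break
    if reading is None:
        raise ValueError("no ID line found")

    match = NAME_RE.match(reading)
    if match is None:
        raise ValueError("reading name %r does not fit the assumed convention" % reading)

    extra = [
        "TN   %s" % match.group("template"),
        "SI   %d..%d" % (insert_min, insert_max),
        "PR   %d" % PRIMER_CODE[match.group("dir")],
    ]
    # Make sure the appended lines start on a line of their own.
    if text and not text.endswith("\n"):
        text += "\n"
    return text + "\n".join(extra) + "\n"
```

To use it, read each experiment file, pass its contents through
add_template_lines() and write the result back. The sketch does not
check whether the file already has TN, SI or PR lines, so run it only
once per file.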

If you've got a Gap4 database with lots of sequences already
assembled, you have two choices. The first is to use extract readings
(directed assembly format) to produce some new experiment files, then
hack some scripts to add the experiment file lines, and then use
directed assembly in a new database to enter them back again. The
second method would be to use the Gap4 scripting language to directly
modify the records. This isn't trivial and can lead to inconsistencies
if you do things wrongly, but it's the most flexible and powerful way
of modifying data within an existing project.

James Bonfield (jkb at mrc-lmb.cam.ac.uk)   Tel: 01223 402499   Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/
