Scripted Directed Assembly Problem-

major major at genome.wi.mit.edu
Fri Mar 29 17:33:12 EST 2002

Hello Staden Group-

I'm having a problem getting a VERY large directed assembly to build.

We use Staden.2000.0 currently.

I have 73,574 reads which comprise 27 contigs in an assembly.  When I
run the directed assembly graphically from gap4, the gapDB is built, but
painfully slowly (I've never let it run to completion on this large data
set).  When I use a modified assemble4 script
(http://www-genome.wi.mit.edu/personal/major/assemble4), I get
this error:

Processing number 7995: G59P61559FC1.T0
Fri 29 Mar 14:27:21 2002 SYSMSG : No such file or directory [2]
Fri 29 Mar 14:27:21 2002 ERROR  : invalid type [1001]
Fri 29 Mar 14:27:21 2002 COMMENT: reading record 0
Fri 29 Mar 14:27:21 2002 FILE   : gap-io.c:171
Gap4 has found an unrecoverable error - These are usually bugs.
Please email all bug reports to staden-package at mrc-lmb.cam.ac.uk.
/home/strontium/major/.lsbatch/1017427958.375492: 12171 Memory fault -
core dumped

*Note* when run on assemblies with < 8000 reads, this builds a valid
gapDB with no problems.

When running the directed assembly via the gap4 GUI, I start gap4 with
-maxseq 2100000 -maxdb 100000, then create a new DB and start the
directed assembly.  I've let it work up to read number 20,000 before
quitting the program (very, very slow).  Via the script, I can never get
it past read 8,000.  Isn't 8,000 what maxdb defaults to?  How do I
set this to a larger number when using scripts to build gapDBs?

I've tried opening gap4 with the appropriate -maxdb/-maxseq values, saving
an empty gapDB, then having the script use that DB as a starting
database, but I still get the failure at read 8,000.

I've also tried to modify line 12 of the assemble4 script to use the
-maxdb and -maxseq flags, but this just causes the script to open to a

This should be a minor fix, but I've spent a few days failing to get
this working due to what seems a simple preference problem...

John Major
