IUBio

Wordsearch anyone?

Peter Rice rice at embl-heidelberg.de
Thu Apr 28 11:02:39 EST 1994


In article <1994Apr25.180523.8976 at comp.bioz.unibas.ch>,
doelz at comp.bioz.unibas.ch (Reinhard Doelz) writes:
> Peter Rice (rice at embl-heidelberg.de) wrote:
> : I just had a complaint from a user that it crashed with integer overflow
> : in the SEARCHHIST routine. It turns out that the count of diagonals
> : searched had overflowed (2*10^9 maximum integer value reached). A quick
> : back-of-the-envelope calculation shows that for a 6000bp search sequence
> : against the full GenEmbl database the overflow is expected.
> 
> 1 KB: 
>          6-mers found: 2,000,000,000
>  Diagonals with words:   102,979,266
>       Total diagonals:   824,029,236
>    Sequences searched:       190,890
>              CPU time:      13:15.32
>                
> 6 KB: 
> 
>          6-mers found: 2,000,000,000
>  Diagonals with words:   452,715,951
>       Total diagonals: -1,610,140,090
>    Sequences searched:       190,890
>              CPU time:      30:18.15
> 
> If this is not an output or other error, I would think that already the 
> numbers of 6-mers is limited. 

That's right. On VMS too. The number of n-mers (Sum) is limited to
MAXHITS, or 2*10^9.

It looks like the same fix is needed for the number of diagonals (TotHits),
although in your case (Irix, I suppose) the count just wrapped around to a
negative value. On VMS it crashes.
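
To illustrate the arithmetic, here is a rough sketch in Python (not the
actual Wordsearch code). It assumes TotHits is held in a 32-bit
two's-complement integer and that the 6 KB run wrapped around exactly once;
both are guesses from the figures quoted above. The helper names
(wrap_int32, unwrap_once, saturating_add) are made up for the example; only
MAXHITS and TotHits come from the program.

```python
# Sketch only: assumes a 32-bit two's-complement counter, which the post
# suggests ("2*10^9 maximum integer value") but does not state outright.
MAXHITS = 2_000_000_000  # cap quoted in the post as 2*10^9


def wrap_int32(n):
    """Return the value a 32-bit two's-complement integer would hold for n."""
    return (n + 2**31) % 2**32 - 2**31


def unwrap_once(observed):
    """Undo a single wraparound of a 32-bit counter (assumes only one wrap)."""
    return observed + 2**32 if observed < 0 else observed


def saturating_add(total, increment, limit=MAXHITS):
    """Add to a counter but stop at limit instead of overflowing.

    This is one possible form of the fix: clamp TotHits the same way the
    n-mer count (Sum) is already clamped to MAXHITS.
    """
    return min(total + increment, limit)


reported = -1_610_140_090           # "Total diagonals" from the 6 KB run
true_total = unwrap_once(reported)  # count before wraparound, if one wrap
print(true_total)                   # 2684827206
print(wrap_int32(true_total))       # -1610140090, matches the report
```

If the single-wrap assumption holds, the 6 KB search covered about 2.68*10^9
diagonals, which is above the 2^31 - 1 limit. With a clamp like
saturating_add the report would show 2,000,000,000 instead, as it already
does for the 6-mers.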

 -----------------------------------------------------------------------------
 Peter Rice, EMBL                             | Post: Computer Group
                                              |       European Molecular
 Internet: Peter.Rice at EMBL-Heidelberg.DE   |       Biology Laboratory
                                              |       Postfach 10-2209
 Phone:   +49-6221-387247                     |       69012 Heidelberg
 Fax:     +49-6221-387306                     |       Germany



More information about the Info-gcg mailing list