Hello Everybody -
In Article <342652A3.3F54 at icrf.icnet.uk>, Aengus Stewart
<aengus.stewart at icrf.icnet.uk> wrote (in part):
>I am getting a core dump - floating point exception when running 9.0
>FASTA against the Eukaryotic Promoter Database (EPD) from Phillip . . .
We too maintain EPD just for this sort of search, and I hadn't tested it
against our GCG version 9.0 FastA until I saw Aengus' note. As Aengus
suggests, the previous version, 8.1, worked just fine with EPD. However, the
Z-score normalization routine (which is new to GCG's version of FastA for
9.0) does fail with EPD in version 9.1. On our system, SGI IRIX 6.2, it
does NOT produce a core dump, but the results aren't very pretty. I'll
include some of the output to give you all a flavor of what happens:
!!SEQUENCE_LIST 1.0
(Nucleotide) FASTA of: promoter.seq from: 1 to: 513 September 22, 1997 09:50
. . .
TO: epd:* Sequences: 1,285 Symbols: 771,000 Word Size: 2
Databases searched:
epd, Release 48.0, Released on 0Oct1996, Formatted on 0Jan1997
Searching with both strands of the query.
Scoring matrix: GenRunData:fastadna.cmp
Constant pamfactor used
Gap creation penalty: 16 Gap extension penalty: 4
. . .
Results sorted and z-values calculated from opt score
1501 scores saved that exceeded 2147483647
1265 optimizations performed
Joining threshold: 71, optimization threshold: 56, opt. width: 16
The best scores are: init1 initn opt z-sc E(0)..
EPD:EP030025 Begin: 327 End: 342 Strand: -
! E30025 Mm c-abl 6.5 kb E1; range -4... 44 44 44 nan0x7fffffff 0
EPD:EP030025 Begin: 505 End: 516
! E30025 Mm c-abl 6.5 kb E1; range -4... 42 42 42 nan0x7fffffff 0
EPD:EP030003 Begin: 241 End: 262 Strand: -
! E30003 Hs c-N-ras; range -499 to 10... 56 56 56 nan0x7fffffff 0
EPD:EP030003 Begin: 319 End: 370
! E30003 Hs c-N-ras; range -499 to 10... 72 72 73 nan0x7fffffff 0
. . .
EPD:EP016062 Begin: 454 End: 509
! E16062 Rn c-myc P2+; range -499 to ... 55 55 55 nan0x7fffffff 0
EPD:EP014067 Begin: 231 End: 256 Strand: -
! E14067 Mm c-myc P2+; range -499 to ... 48 48 49 nan0x7fffffff 0
EPD:EP014067 Begin: 239 End: 271
! E14067 Mm c-myc P2+; range -499 to ... 75 75 75 nan0x7fffffff 0
EPD:EP011146 Begin: 546 End: 561 Strand: -
! E11146 Hs c-myc P1; range -499 to 1... 44 44 44 nan0x7fffffff 0
EPD:EP011146 Begin: 423 End: 437
! E11146 Hs c-myc P1; range -499 to 1... 56 56 57 nan0x7fffffff 0
EPD:EP016061 Begin: 9 End: 45 Strand: -
! E16061 Rn c-myc P1; range -499 to 1... 59 59 65 nan0x7fffffff 0
EPD:EP016061 Begin: 94 End: 145
! E16061 Rn c-myc P1; range -499 to 1... 63 63 68 nan0x7fffffff 0
\\End of List
promoter.seq /rev
EPD:EP030025
ID EP030025 standard; DNA; EPD; 600 BP.
AC E30025;
DE Mm c-abl 6.5 kb E1; range -499 to 100.
CC Source: Eukaryotic Promoter Database / Release 48
CC Mm c-abl 6.5 kb E1 :+M ROD:MMABLC1B 1+ 665; 30025.
CC . . .
SCORES Init1: 44 Initn: 44 Opt: 44 z-score: nan0x7fffffff E(): 0
75.0% identity in 16 bp overlap
199 189 179 169 159 149
promoter.seq AAAATTTTCCAACTTAAAATTAAATATATAAAAATATATTTTTAAATCAATATCTAACTT
|||| || ||||||
EP030025 GCGCTTCCTCATCTCTCACCTTGAGCTCAGAAAAGCTACCTTTAAAAGGTCGTGCGGAGC
300 310 320 330 340 350
As you can see, the search itself is probably valid, but the statistics are
pretty hard to interpret, since it's darn difficult for Pearson's Expectation
Function to calculate a probability from a z-score of nan0x7fffffff :^)
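To make that concrete, here is a minimal Python sketch of how a not-a-number
z-score poisons whatever is computed from it. Everything in it is an
assumption for illustration: the mean and standard deviation are made up, the
zero standard deviation is just one way to end up with a NaN (not a diagnosis
of the GCG bug), and the normal upper-tail probability is a hypothetical
stand-in, NOT Pearson's actual expectation function.

```python
import math

def z_score(score, mean, sd):
    """Ordinary z-score; mimics IEEE arithmetic in C, where 0.0/0.0 is NaN."""
    if sd == 0.0:
        diff = score - mean
        return math.nan if diff == 0.0 else math.copysign(math.inf, diff)
    return (score - mean) / sd

def expectation(z, n_sequences):
    """Hypothetical stand-in: normal upper-tail probability times library size."""
    p = 0.5 * math.erfc(z / math.sqrt(2.0))
    return n_sequences * p

# Made-up statistics: a well-behaved case and a degenerate one.
good = expectation(z_score(75.0, 50.0, 10.0), 1285)
bad = expectation(z_score(44.0, 44.0, 0.0), 1285)
print(good, bad)  # the second value is nan
```

Once the z-score is NaN, every value derived from it is NaN too, so there is
no sensible expectation left to report.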
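I don't know what the problem is; maybe it's all the N's, as Aengus suggests.
But this line is certainly suspect: "1501 scores saved that exceeded
2147483647." 2147483647 is WAY too big for an "opt" score. It is 2^31 - 1,
the largest signed 32-bit integer, and the same 0x7fffffff that shows up in
the z-score column - something screwy IS going on.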
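For anyone who wants to check their own output files for the same symptoms,
a rough Python sketch follows. The function name and the two patterns are my
own assumptions, based only on the lines quoted above; other FastA versions
may word things differently, and a description that happens to contain a
word starting with "nan" would be flagged by mistake.

```python
import re

INT32_MAX = 2**31 - 1  # 2147483647, i.e. 0x7fffffff

def suspicious_lines(report):
    """Yield (line_number, reason) for lines that hint at broken statistics."""
    for number, line in enumerate(report.splitlines(), start=1):
        # A score column that was printed as not-a-number.
        if re.search(r"\bnan", line, flags=re.IGNORECASE):
            yield number, "NaN in a score column"
        # A cutoff at or above the largest signed 32-bit integer.
        match = re.search(r"scores saved that exceeded (\d+)", line)
        if match and int(match.group(1)) >= INT32_MAX:
            yield number, "cutoff equals the 32-bit integer maximum"

# Three lines copied from the output above.
sample = """\
1501 scores saved that exceeded 2147483647
1265 optimizations performed
! E30025 Mm c-abl 6.5 kb E1; range -4... 44 44 44 nan0x7fffffff 0
"""

for number, reason in suspicious_lines(sample):
    print(number, reason)
```

On the three sample lines this flags the first (the cutoff) and the third
(the NaN z-score) and leaves the second alone.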
Cheers - Steve
Steven M. Thompson
Consultant in Molecular Genetics and Sequence Analysis
Visualization, Analysis & Design in the Molecular Sciences (VADMS)
Washington State University, Pullman, WA 99164-4660, USA
AT&Tnet: (509) 335-3179 FAX: (509) 335-9688
INTERnet: thompson at ribozyme.vadms.wsu.edu