Peter Rice wrote:
><SNIP>
> Sorry Guy - that may be close but it ain't close enough.
> Try "fetch test.seq" in GCG and see whether you get the same checksum.
>
> For example, the following modification to your code gives an answer
> of 2584 for gcg_chars but GCG reformat gives an answer of 7132. GCG's
> reformat has to be the authority here - anything except the
> reformatted value will be rejected by GCG programs.
>
> Your version fails on simple lower case. It should still return 2160.
> That's before worrying about the other 'valid' GCG sequence characters.
><SNIP>
>
OK; my earlier version was a rather clumsy solution, written for
sequences which I knew to contain only upper case A-Z.
Below is a revised function which works for the cases you mentioned.
Guy.
--
/* START REVISED EXAMPLE */
#include <stdio.h>
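static int CheckSumGCG(const char *seq){
    /* Lookup table indexed by character code. Valid GCG symbols map to
       themselves (lower case is folded to upper case); '-' marks a
       character that is left out of the sum. */
    static const char index[] =
        "--------------------------------------&---*---.-----------------"
        "@ABCDEFGHIJKLMNOPQRSTUVWXYZ------ABCDEFGHIJKLMNOPQRSTUVWXYZ---~-"
        "----------------------------------------------------------------"
        "----------------------------------------------------------------";
    int i, ch, check = 0;
    for(i = 0; seq[i] != '\0'; i++)
        /* Cast so that a char value above 127 cannot index below zero. */
        if((ch = index[(unsigned char)seq[i]]) != '-')
            check += ((i % 57) + 1) * ch;
    return check % 10000;
}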
int main(void){
    char *calm_human =
        "adqlteeqiaefkeafslfdkdgdgtittkelgtvmrslgqnpteaelqd"
        "minevdadgngtidfpefltmmarkmkdtdseeeireafrvfdkdgngyi"
        "saaelrhvmtnlgekltdeevdemireadidgdgqvnyeefvqmmtak";
    char *test_seq =
        "GCTGCCGCAGCGGCXGATGACAATAACRAYTGTTGCTGYGATGACGAYGA"
        "AGAGGARTTTTTCTTYGGTGGCGGAGGGGGXCATCACCAYATTATCATAA"
        "THAAAAAGAARTTGTTACTTCTCCTACTGTTRCTXYTAYTGYTRYTXATG"
        "AATAACAAYCCTCCCCCACCGCCXCAACAGCARCGTCGCCGACGGCGGAG"
        "AAGGCGXAGRMGAMGGMGRMGXTCTTCCTCATCGAGTAGCTCXAGYWSXA"
        "CTACCACAACGACXGTTGTCGTAGTGGTXTGGXXXTATTACTAYGAAGAG"
        "CAACAGSARTAATAGTGATARTRATRRABCDEFGHIJKLMNOPQRSTUVW"
        "XYZ.~@&*abcdefghijklmnopqrstuvwxyz*@&~.";
    char *gcg_chars =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz.*~&@";
    printf("Human Calmodulin GCG Checksum = %d\n",
           CheckSumGCG(calm_human) ); /* 2160 */
    printf("GCG test sequence Checksum = %d\n",
           CheckSumGCG(test_seq) ); /* 3365 */
    printf("GCG sequence characters Checksum = %d\n",
           CheckSumGCG(gcg_chars) ); /* 7132 */
    return 0;
}
/* END REVISED EXAMPLE */
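
For anyone who would rather not maintain a 256-character table, here is a
rough sketch of the same weighted sum using <ctype.h> instead. It is not
the function above, only an illustration of the same idea. The names
is_gcg_symbol and checksum_gcg_sketch are hypothetical, and the set of
valid symbols (letters plus . * ~ & @) is an assumption read off the
lookup table above rather than taken from any GCG documentation. It also
assumes the default "C" locale, so that isalpha() accepts only A-Z and a-z.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: nonzero if c is a letter or one of . * ~ & @
   (the symbols the lookup table above treats as valid).
   Assumes the default "C" locale. */
static int is_gcg_symbol(int c){
    return isalpha(c) || (c != '\0' && strchr(".*~&@", c) != NULL);
}

/* Sketch of the same position-weighted sum. The running total is
   reduced modulo 10000 at every step, so it stays small however long
   the sequence is. */
static int checksum_gcg_sketch(const char *seq){
    size_t i;
    int check = 0;
    assert(seq != NULL);
    for(i = 0; seq[i] != '\0'; i++){
        /* Go through unsigned char before calling the ctype functions. */
        int c = (unsigned char)seq[i];
        /* Skipped characters still advance the position i. */
        if(is_gcg_symbol(c))
            check = (check + (int)(i % 57 + 1) * toupper(c)) % 10000;
    }
    return check;
}
```

Reducing at each step does not change the final remainder, so on the
gcg_chars string from the example above this sketch should also give 7132.
As before, GCG's own reformat remains the authority on what is accepted.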
--
----------------------------------------------------------------------
Guy St.C. Slater, Tel : (44) 1223 494 565
Human Genome Mapping Project Resource Centre, Fax : (44) 1223 494 512
Wellcome Trust Genome Campus, mailto:gslater at hgmp.mrc.ac.uk
Hinxton, Cambridge, CB10 1SB. http://www.hgmp.mrc.ac.uk/~gslater/
----------------------------------------------------------------------