Here is the GCG checksum algorithm:
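public static long GCGchecksum(Bioseq seq, int offset, int seqlen)
{
    // long rather than int: the running sum can overflow an int on very
    // long sequences (roughly a million residues or more)
    long check = 0;
    int count = 0;
    byte[] ba = seq.toBytes();
    for (int i = 0; i < seqlen; i++) {
        byte b = ba[i + offset];
        if (b >= 'a' && b <= 'z') b -= 32;  // fold lower case to upper case
        count++;
        check += count * b;
        if (count == 57) count = 0;         // position weight cycles 1..57
    }
    check %= 10000;
    return check;
}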
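
For anyone who wants to try this without the Bioseq class, here is a
minimal self-contained sketch of the same calculation on a plain String.
The names GcgChecksumDemo and gcgChecksum are made up for illustration and
are not part of readseq, and the sketch assumes the sequence is plain ASCII.

```java
// Hypothetical stand-alone demo; not part of readseq.
class GcgChecksumDemo {

    // Same calculation as GCGchecksum above, applied to a whole ASCII String.
    static long gcgChecksum(String seq) {
        long check = 0;
        int count = 0;
        for (int i = 0; i < seq.length(); i++) {
            char c = seq.charAt(i);
            if (c >= 'a' && c <= 'z') c -= 32;  // fold lower case to upper case
            count++;
            check += (long) count * c;
            if (count == 57) count = 0;         // position weight cycles 1..57
        }
        return check % 10000;
    }

    public static void main(String[] args) {
        // A=65*1 + C=67*2 + G=71*3 + T=84*4 = 748
        System.out.println(gcgChecksum("ACGT"));  // prints 748
        System.out.println(gcgChecksum("acgt"));  // prints 748, case is ignored
    }
}
```

Because each residue is multiplied by its position weight, swapping two
residues generally changes the checksum, which a plain sum would not detect.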
See also http://iubio.bio.indiana.edu/soft/molbio/readseq/java/
for a program that reformats sequences, including the GCG
single-sequence and MSF formats.
--
-- d.gilbert--bioinformatics--indiana-u--bloomington-in-47405
-- gilbertd at bio.indiana.edu