
generating GCG checksum value via perl

Don Gilbert gilbertd at sunflower.bio.indiana.edu
Fri Oct 6 08:42:39 EST 1995


This is the GCG checksum in Perl, from Jong's source:
  # start from zero so the code can be run on more than one sequence
  $cnt = 0;
  $sum = 0;
  $len = length($seq);
  for ($i = 0; $i < $len; $i++) {
    $cnt++;
    $sum += $cnt * ord(substr($seq, $i, 1));
    ($cnt == 57) && ($cnt = 0);
  }
  $sum %= 10000;

This is the GCG checksum in C that I use:

  for (i = 0; i < seqlen; i++) {
    count++;
    check += count * to_upper(seq[i]);
                     ^^^^^^^^
    if (count == 57) count = 0;
    }
  check %= 10000;


Note that you will get a different checksum if your sequence
has lowercase letters.  You need the to_upper equivalent
to get the sum that GCG software uses.
Perhaps the easiest way to do this in Perl is to convert
the whole sequence to uppercase first, as with
  $seq = "\U$seq";
If you don't want to change the sequence, convert each letter
to uppercase before doing ord() on it, for example with
  ord(uc(substr($seq,$i,1)))
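
For anyone who wants to try the C loop on its own, here is a minimal
self-contained sketch.  It is not the original source: the function name
gcg_checksum is made up for illustration, and the standard toupper() from
<ctype.h> is assumed to be an acceptable stand-in for to_upper.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: case-insensitive GCG-style checksum of seq[0..seqlen-1].
   Each character is weighted by a counter that cycles 1..57. */
int gcg_checksum(const char *seq, size_t seqlen)
{
    long check = 0;
    int count = 0;
    size_t i;

    for (i = 0; i < seqlen; i++) {
        count++;
        /* cast to unsigned char: toupper() is undefined for negative values */
        check += (long)count * toupper((unsigned char)seq[i]);
        /* reducing at every step gives the same result as one final
           "check %= 10000" and keeps the sum small for long sequences */
        check %= 10000;
        if (count == 57)
            count = 0;
    }
    return (int)check;
}
```

With this sketch, gcg_checksum("ACGT", 4) and gcg_checksum("acgt", 4) both
give 748 (1*65 + 2*67 + 3*71 + 4*84).  Without the toupper() call, the
lowercase sequence would give 1068 instead.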

- don

-- 
-- d.gilbert--biocomputing--indiana u--bloomington--gilbertd at bio.indiana.edu


