IUBio Biosequences .. Software .. Molbio soft .. Network News .. FTP

generating GCG checksum value via perl

Mike Cherry cherry at stout.Stanford.EDU
Thu Oct 5 15:19:19 EST 1995


In article <450vam$2nh at grapevine.lcs.mit.edu>,  <c_raffi at lcs.mit.edu> wrote:
>  I am trying to write a perl script (well MacPerl actually...) which
>  will take DNA sequences in ascii format and convert them to GCG
>  format.
>
>  Before I reinvent the wheel I was wondering if anything like this already
>  exists in perl, I am specifically looking for a perl implementation of
>  the algorithim that generates the GCG checksum value for each sequence.
>
>  regards,
> chris dagdigian
>
>--
>cdagdigian at genetics.com
>c_raffi at ararat.lcs.mit.edu

I resently found a library of Perl routines which included the
following:

First the header for the library:


##############################################################################
# Header section start              (L) Copyleft software
##############################################################################
 # These are freely copiable, modifiable, mutatable codes for everybody.
 # I don't believe in some people saying originality, patent, creativity,...
 # whatsoever in the intellectuallity of human beings while working in academic
 # labs with public funds.
 # You can even sell, make profit, become millionare ... with mine, if you can.\
.
 # as long as you like Science and are serious in it (not just to make a living\

 # or as a career).
 # There might be other people's copyrighted codes(without notice) in this lib.\

 # In that case it is my mistake as I never want to put any copyright protected\

 # code here whether it is pd or commercial, without any labels. (you do NOT
 # need to put my name in any of taken codes)
 # For all the codes I write, I will TRY TO FIX any problems
 # which have occurred while you used the programs. Contact me, I will
 # do my best. I am the programmer and responsible for my codes.
 # If you know  Perl well, and find inefficient things from any of
 # my code, please email me, as I am a student and for all my life so will be.
 # My codes can be very verbose as they are for my own reference. Just cut them\

 # off to make them neater.
#==============================================================================
 #         I AM NOT SUPPORTED BY MRC OR CAMBRIDGE U.,        so they
 # will not have any right on mine. Essentially, as far as I know, any support \

 # I get from them is based on accurate calculations by the Uni., so that I
 # pay for everythig myself. Hence, I am sure all my work is free for you.
 # I honestly hope any of mine is useful for any of you. All my codes are
 # virtually co-authored by Tim Hubbard.  - jong at mrc-lmb.cam.ac.uk
#------------------------------------------------------------------------------
#                           -- Tim Hubbard's student, Jong  --
#<<< If you do not like references and do not understand them, please
#<<< email to jong at mrc-lmb.cam.ac.uk, so that I can make versions which
#<<< do not use non essential references.
###############################################################################
# Header section end              (L) Copyleft software
###############################################################################

Now what you want.


#____________________________________________________________________________
# Title    : write_gcg_genbank_file  Copyright (C) 1993-1994 by James Tisdall
# Usage    :
# Function :
# Argument :
# Returns  : a ref. of an array for GCG-Genbank formatted sequence record
# Warning  :  stolen from Tisdall
#---------------------------------------------------------------------------
sub write_gcg_genbank_file {                    ##### PutGcgGenbank
  my($seq,$header,$id) = @_;
  my(@out,$len,$sum,$cnt,$i,$j);
  $len = length($seq);
  for($i=0; $i<$len ;$i++) {
    $cnt++;
    $sum += $cnt * ord(substr($seq,$i,1));
    ($cnt == 57) && ($cnt=0);
  }
  $sum %= 10000;
  push(@out,"LOCUS $id\n");
  push(@out,"DEFINITION $header\n");
  push(@out,"ACCESSION $Accession\n");
  push(@out,"ORIGIN\n\n");
  push(@out," $id Length: $len (today)  Type: N Check: $sum  ..\n\n");
  $len = length($seq);
  $i = $#out + 1;
  for($j = 0 ; $j < $len ; ) {
    if( $j % 50 == 0) {
      $out[$i] = sprintf("%8d  ",$j+1);
    }
    $out[$i] .= sprintf("%s",substr($seq,$j,10));
    $j += 10;
    if( $j < $len && $j % 50 != 0 ) {
      $out[$i] .= " ";
    }elsif($j % 50 == 0 ) {
      $out[$i++] .= "\n";
      $out[$i++] = "\n";
    }
  }
  if($j % 50 != 0 ) {
    $out[$i] .= "\n";
  }
  $out[$i] .= "\n";
  return \@out;



More information about the Info-gcg mailing list

Send comments to us at biosci-help [At] net.bio.net