In article <450vam$2nh at grapevine.lcs.mit.edu>, <c_raffi at lcs.mit.edu> wrote:
> I am trying to write a perl script (well MacPerl actually...) which
> will take DNA sequences in ascii format and convert them to GCG
> format.
>> Before I reinvent the wheel I was wondering if anything like this already
> exists in perl, I am specifically looking for a perl implementation of
> the algorithim that generates the GCG checksum value for each sequence.
>> regards,
> chris dagdigian
>>--
>cdagdigian at genetics.com>c_raffi at ararat.lcs.mit.edu
I resently found a library of Perl routines which included the
following:
First the header for the library:
##############################################################################
# Header section start (L) Copyleft software
##############################################################################
# These are freely copiable, modifiable, mutatable codes for everybody.
# I don't believe in some people saying originality, patent, creativity,...
# whatsoever in the intellectuallity of human beings while working in academic
# labs with public funds.
# You can even sell, make profit, become millionare ... with mine, if you can.\
.
# as long as you like Science and are serious in it (not just to make a living\
# or as a career).
# There might be other people's copyrighted codes(without notice) in this lib.\
# In that case it is my mistake as I never want to put any copyright protected\
# code here whether it is pd or commercial, without any labels. (you do NOT
# need to put my name in any of taken codes)
# For all the codes I write, I will TRY TO FIX any problems
# which have occurred while you used the programs. Contact me, I will
# do my best. I am the programmer and responsible for my codes.
# If you know Perl well, and find inefficient things from any of
# my code, please email me, as I am a student and for all my life so will be.
# My codes can be very verbose as they are for my own reference. Just cut them\
# off to make them neater.
#==============================================================================
# I AM NOT SUPPORTED BY MRC OR CAMBRIDGE U., so they
# will not have any right on mine. Essentially, as far as I know, any support \
# I get from them is based on accurate calculations by the Uni., so that I
# pay for everythig myself. Hence, I am sure all my work is free for you.
# I honestly hope any of mine is useful for any of you. All my codes are
# virtually co-authored by Tim Hubbard. - jong at mrc-lmb.cam.ac.uk
#------------------------------------------------------------------------------
# -- Tim Hubbard's student, Jong --
#<<< If you do not like references and do not understand them, please
#<<< email to jong at mrc-lmb.cam.ac.uk, so that I can make versions which
#<<< do not use non essential references.
###############################################################################
# Header section end (L) Copyleft software
###############################################################################
Now what you want.
#____________________________________________________________________________
# Title : write_gcg_genbank_file Copyright (C) 1993-1994 by James Tisdall
# Usage :
# Function :
# Argument :
# Returns : a ref. of an array for GCG-Genbank formatted sequence record
# Warning : stolen from Tisdall
#---------------------------------------------------------------------------
sub write_gcg_genbank_file { ##### PutGcgGenbank
my($seq,$header,$id) = @_;
my(@out,$len,$sum,$cnt,$i,$j);
$len = length($seq);
for($i=0; $i<$len ;$i++) {
$cnt++;
$sum += $cnt * ord(substr($seq,$i,1));
($cnt == 57) && ($cnt=0);
}
$sum %= 10000;
push(@out,"LOCUS $id\n");
push(@out,"DEFINITION $header\n");
push(@out,"ACCESSION $Accession\n");
push(@out,"ORIGIN\n\n");
push(@out," $id Length: $len (today) Type: N Check: $sum ..\n\n");
$len = length($seq);
$i = $#out + 1;
for($j = 0 ; $j < $len ; ) {
if( $j % 50 == 0) {
$out[$i] = sprintf("%8d ",$j+1);
}
$out[$i] .= sprintf("%s",substr($seq,$j,10));
$j += 10;
if( $j < $len && $j % 50 != 0 ) {
$out[$i] .= " ";
}elsif($j % 50 == 0 ) {
$out[$i++] .= "\n";
$out[$i++] = "\n";
}
}
if($j % 50 != 0 ) {
$out[$i] .= "\n";
}
$out[$i] .= "\n";
return \@out;