molecular weight

mathog at seqaxp.bio.caltech.edu
Fri Mar 1 14:32:28 EST 1996

In article <J.Parkhill-2802960943480001 at bcs113.bham.ac.uk>, J.Parkhill at bham.ac.uk (Julian Parkhill) writes:
>In article <312B5932.7030 at purdue.edu>, Rick Westerman <westerm at purdue.edu>
>> > What program should I use in GCG if I want to calculate molecular weight
>> > of some nucleotide sequence? (for example oligo)
>> I have defined a symbol called "peptideweight" (for VMS):
>> For the UNIX version you could define an appropriate symbol or just run 
>> the PeptideSort program manually.
>> As an aside to Peter: Telling a person to run a program that they don't 
>> have (e.g. "nip") isn't very useful.
>Equally, it is not particularly helpful to suggest using Peptidesort to
>calculate the MW of a nucleotide sequence.
>A good approximation for the MW of nucleotides is 330 per base, or 660 per
>base-pair for d.s. DNA.  That should be OK for most purposes.
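
As a rough illustration of the rule of thumb quoted above, here is a minimal
Python sketch. The function name approx_mw and its parameters are made up for
this example; only the 330 and 660 figures come from the quoted post.

```python
# Rule-of-thumb molecular weight of a nucleic acid from its length alone.
# The two constants are the approximations quoted in the post; everything
# else (names, argument layout) is hypothetical.
AVG_PER_BASE = 330.0       # single-stranded, per base
AVG_PER_BASE_PAIR = 660.0  # double-stranded DNA, per base pair


def approx_mw(length, double_stranded=False):
    """Return the approximate MW for a sequence of the given length."""
    if length < 0:
        raise ValueError("length must be non-negative")
    per_unit = AVG_PER_BASE_PAIR if double_stranded else AVG_PER_BASE
    return length * per_unit


print(approx_mw(20))                        # 20-mer oligo -> 6600.0
print(approx_mw(20, double_stranded=True))  # 20 bp duplex -> 13200.0
```

This ignores base composition and end groups entirely, which is why it is
only "OK for most purposes".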

I have modified COMPOSITION to do this calculation.  The modified version has 3
extra command line options: 

  /MW     do MW instead of di and trinuc frequencies
  /RNA    MW of RNA (T and U -> U). default is DNA (T and U -> T)
          No 5' phosphate  (default has a 5' Phosphate)

Use /MW/both for DS weights.
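
For readers without access to GCG, the kind of calculation these options
describe might be sketched in Python as below. This is not the code in
composition.for. The residue weights are commonly quoted average values and
are an assumption here, as are all function and parameter names; the
program's own weight table may differ.

```python
# Composition-based MW of a linear nucleic acid strand.
# ASSUMPTION: average residue masses (nucleotide monophosphate minus water),
# commonly quoted values, NOT read from the modified COMPOSITION program.
DNA_RESIDUE = {"A": 313.21, "C": 289.18, "G": 329.21, "T": 304.20}
RNA_RESIDUE = {"A": 329.21, "C": 305.18, "G": 345.21, "U": 306.17}
WATER = 18.02      # terminal H and OH of a linear strand
PHOSPHATE = 79.98  # HPO3, subtracted when there is no 5' phosphate

DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "U": "A"}


def normalize(seq, rna=False):
    """Map T and U to U for RNA, or T and U to T for DNA."""
    seq = seq.upper()
    return seq.replace("T", "U") if rna else seq.replace("U", "T")


def strand_mw(seq, rna=False, five_prime_phosphate=True):
    """MW of one strand; raises KeyError on ambiguity codes."""
    bases = normalize(seq, rna)
    if not bases:
        return 0.0
    table = RNA_RESIDUE if rna else DNA_RESIDUE
    total = sum(table[b] for b in bases) + WATER
    if not five_prime_phosphate:
        total -= PHOSPHATE
    return total


def duplex_mw(seq, rna=False, five_prime_phosphate=True):
    """MW of the strand plus its reverse complement (both strands)."""
    bases = normalize(seq, rna)
    pairs = RNA_COMPLEMENT if rna else DNA_COMPLEMENT
    other = "".join(pairs[b] for b in reversed(bases))
    return (strand_mw(bases, rna, five_prime_phosphate)
            + strand_mw(other, rna, five_prime_phosphate))


print(round(strand_mw("ACGT"), 2))
print(round(strand_mw("ACGT", five_prime_phosphate=False), 2))
print(round(duplex_mw("ACGT"), 2))
```

The keyword arguments only mirror the idea behind the options listed above
(RNA versus DNA, with or without a 5' phosphate, one strand or both); they
do not reproduce the program's actual qualifiers.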

If you have a valid GCG license, you may pick up the modified version.
Anonymous FTP to seqaxp.bio.caltech.edu, then issue the commands:

   GET [.PICKUP]composition.for composition.for
   GET [.PICKUP]composition.cmd composition.cmd

and build it on your system (you won't be able to DIR or LS into [.PICKUP]).
On VMS I built this with: 

 $ for/extend/nolis composition
 $ genlink composition

Please report any bugs back to me and I'll fix them. (This version has not
been extensively tested.)

Known limitations:

1  Weights are stored as reals, not doubles, so don't expect the last few
   digits to be correct if you calculate the weight of a YAC! 

2  $ composition/infile=*.seq/mw

   will give the SUM molecular weight, rather than the individual 
   weights for each sequence, which is probably not what you want.

3  There is no standard way to use this on sequences with oddball RNA bases.
   (However, if you need that, it would be easy to add those values to
   the weight table, and then use the degenerate bases to indicate the 
   alternate RNA bases.)
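
To give a feel for limitation 1, here is a small Python sketch of what
happens when a running total is kept in single precision. It assumes the
weight is accumulated one base at a time and uses an IEEE 754 single as a
stand-in for a Fortran REAL; the program's actual summation order and the
REAL format on a given VMS system may differ. The 330 per base figure and
all names are illustrative only.

```python
import struct


def to_single(x):
    """Round a Python float (a double) to the nearest IEEE 754 single."""
    return struct.unpack("f", struct.pack("f", x))[0]


def summed_weight(n_bases, per_base=330.0, single=False):
    """Add per_base n_bases times, optionally rounding after every add."""
    total = 0.0
    for _ in range(n_bases):
        total += per_base
        if single:
            # Emulate storing the running total in a single-precision REAL.
            total = to_single(total)
    return total


n = 1_000_000  # roughly YAC-sized, in bases
as_double = summed_weight(n)
as_single = summed_weight(n, single=True)
print("double:", as_double)
print("single:", as_single)
print("difference:", as_single - as_double)
```

A single carries only about 7 significant decimal digits, so once the total
is in the tens of millions each addition is rounded and the errors pile up.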


David Mathog
mathog at seqaxp.bio.caltech.edu
Manager, sequence analysis facility, biology division, Caltech 
