
announce: Peaks 0.5 integrates areas

Russell L. Malmberg russell at dogwood.botany.uga.edu
Mon Apr 29 09:48:28 EST 1996


My lab needed a program to integrate areas from data derived from our
HPLC, so I wrote a small program, called Peaks, to fill this need.
I am making the program available in case anyone else has a similar
need.

Peaks is compiled as a 32-bit command-line program for Microsoft
Windows 95 or NT.
Versions are available for Intel and DEC alpha processors.

Peaks takes data in the form of two columns of numbers, the first
column typically being time (X axis) and the second being some
value (Y axis).  The program then detects peaks, integrates, and
subtracts out the baseline, according to user configurable parameters.

A major advantage of Peaks.exe is that it is free.
If it is useful to you, or you have comments, send me an e-mail.

Peaks version 0.5 may be downloaded by anonymous FTP from:
   dogwood.botany.uga.edu
   /pub/Malmberg
   Peaks05.intel.zip is for an Intel processor.
   Peaks05.alpha.zip is for a DEC alpha processor.

The Peaks.txt file is appended below.

Russell L. Malmberg
russell at dogwood.botany.uga.edu




=========================================================
Peaks.txt file
---------------------------------------------------------
Peaks version 0.5 -- Russell L. Malmberg -- 26 April 1996
=========================================================


     Russell L. Malmberg
     Botany Department
     University of Georgia
     Athens, GA 30602-7271

     e-mail: russell at dogwood.botany.uga.edu


This is a 32-bit command-line program for Windows 95 or Windows NT.
     Peaks05.intel.zip is for an Intel processor.
     Peaks05.alpha.zip is for a DEC alpha processor.
It is my first serious programming attempt in C++, so it probably has
bugs. The program is free, and I hope it is useful to you, but use it
at your own risk.

Please send me an e-mail with comments/bugs if you use the program and
find it useful at all.

Peaks was written to analyze data from our laboratory HPLC.  The data
consist of a time axis and a y-axis, and we wished to integrate the
areas under the various peaks.


What it does:
=============
1.  Peak detection:
   The program moves along the X-axis calculating the slope of the
   curve and a moving average of the slope.  The program detects the
   start of a peak when the moving average of the slope rises above
   some positive value (OnSlopeValue); it records the peak time as
   the point where the moving average of the slope passes from
   positive through zero to negative; it then detects the end of a
   peak when the moving average of the slope, now negative, rises
   back above some negative value (OffSlopeValue).
   In addition to the slopes, the program can use absolute threshold
   values (OnThresholdValue and OffThresholdValue) as triggers for the
   start and stop of a peak.  The threshold values are useful only if
   the baseline does not rise or fall significantly.
   The user can set the number of points in the moving average of the
   slope (PtsMovAvgDer) to adjust for how much noise is in the data
   set.
2.  Integration:
   The program computes the area under the detected peaks, and
   estimates the baseline area by comparison to points taken just
   before and after the beginning and ending of the peak, and also by
   comparison with a curve fitted to the baseline.
   The program reports the area after baseline subtraction, as well as
   the unadjusted area and the baseline area used.
   You can prevent tiny peaks from being reported by adjusting the
   minimum area to be reported (MinPeakArea).
   The curve fitted to the baseline is an exponential curve, whose
   parameters are reported in the output file.  This is useful if you
   have a rising or falling baseline.
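
  The detection rules in step 1 can be sketched in Python as shown
  here.  This is only an illustration of the description above, not
  the code inside Peaks.exe (which is written in C++).  The function
  names, the trailing form of the moving average, and the handling of
  ties are all assumptions.  The default values are the ones listed
  for the [Integration] section of Peaks.ini.

```python
def moving_average(values, n):
    """Trailing moving average over (up to) the last n values."""
    out = []
    total = 0.0
    for i, v in enumerate(values):
        total += v
        if i >= n:
            total -= values[i - n]  # drop the value that left the window
        out.append(total / min(i + 1, n))
    return out


def detect_peaks(xs, ys, on_slope=1.0, off_slope=-0.6, pts_mov_avg=20):
    """Return a list of (start, apex, end) index triples.

    Hypothetical helper: mirrors OnSlopeValue, OffSlopeValue and
    PtsMovAvgDer, but is not taken from the Peaks source code.
    """
    # slopes[i] is the slope of the segment from point i to point i + 1.
    slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
              for i in range(len(xs) - 1)]
    smooth = moving_average(slopes, pts_mov_avg)

    peaks = []
    state = "baseline"
    start = apex = 0
    steep = False  # has the smoothed slope been below off_slope yet?
    for i, s in enumerate(smooth):
        if state == "baseline":
            if s > on_slope:            # start of a peak
                state, start = "rising", i
        elif state == "rising":
            if s <= 0.0:                # slope went positive -> zero/negative
                state, apex = "falling", i
                steep = s < off_slope
        else:                           # state == "falling"
            if s < off_slope:
                steep = True
            elif steep and s > off_slope:   # slope rose back above off_slope
                peaks.append((start, apex, i))
                state = "baseline"
    return peaks
```

  Note that this sketch never closes a peak whose falling slope stays
  above off_slope, and it does not model the threshold triggers
  (OnThresholdValue and OffThresholdValue).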


Known Bugs:
===========
  The program is not bullet-proof: if you give it a character when it
    is expecting a number, or vice versa, it may crash.  If this
    seems to have happened, simply CTRL-C out.
  If you set the [ArraySize] parameters in the Peaks.ini file much
    smaller than the actual size of the data set, the program
    sometimes, but not always, crashes.
    I don't know why this happens.


Missing Features:
=================
  Peaks does not (yet) produce any graphical output.
  It does not (yet) have an internal standard calibration method
     for peaks of known, defined, area.


Installation:
=============
  0.  Unzip.
  1.  Put Peaks.exe and Peaks.ini in a directory specified by the
      PATH environment variable.
  2.  Set the [ArraySize] parameters in the peaks.ini file to be
      larger than the number of rows of data in your files, and to be
      larger than the maximum number of peaks you expect.
  3.  Do trial runs, changing the parameter values in the user dialog.
  4.  Modify peaks.ini to change [Integration] defaults to suit your
      data.  This will set the default values in the user dialog.


File Format:
============
  Peaks computes the area under peaks on a curve from an input file in
  this format:
   -- Zero or more rows of header/comment (character) information,
         followed by:
   -- Two columns of data, the first column of which is X (time) and
         the second column of which is the Y for the X.
  Peaks treats any row that does not begin with a number as a
      header/comment row.
  Once the first character in a row is a number, Peaks expects that
      everything from there on will be the two columns of data.
  A sample file "sample.pks" is included.
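
  For anyone preparing or checking input files with a script, the
  format rule above can be sketched in Python as follows.  This is
  not the parser inside Peaks.exe.  The function name is made up, and
  the sketch assumes that columns are separated by whitespace, that
  blank lines can be skipped, and that "begins with a number" means
  "begins with a digit"; how Peaks itself treats a leading sign or
  decimal point is not stated here.

```python
def read_peaks_file(lines):
    """Split text lines into (header_rows, xs, ys).

    Hypothetical helper illustrating the documented file format.
    """
    header, xs, ys = [], [], []
    in_data = False
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue  # assumption: blank lines carry no information
        if not in_data and not stripped[0].isdigit():
            header.append(stripped)  # header/comment row
            continue
        # From the first numeric row onward, every row is treated as data.
        in_data = True
        x_text, y_text = stripped.split()[:2]
        xs.append(float(x_text))
        ys.append(float(y_text))
    return header, xs, ys
```

  To read a file from disk, pass the open file object, for example
  read_peaks_file(open("sample.pks")).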


Peaks.ini file:
===============
  There is a peaks.ini file that sets the default values for the
  program. The default values for integration parameters can be
  modified at run time in the user dialog.
  The format of the Peaks.ini file is similar to that of any standard
  Windows .ini file: bracketed section names, each followed by
  name=number lines.

	[Integration]
	OnSlopeValue=1.0
	OffSlopeValue=-0.6
	OnThresholdValue=1.0
	OffThresholdValue=0.2
	PtsMovAvgDer=20
	MinPeakArea=10.0

	[ArraySize]
	RowsOfData=10000
	MaxNumPeaks=100

  The [Integration] parameters are used in the user dialog as the
  default values.  Change these parameters in the .ini file when you
  have a set of parameters that integrates your data in a reasonable
  way. The [ArraySize] parameters should be 10% to 50% larger than
  the actual numbers you have or expect from your data.
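
  If you want to inspect or sanity-check a Peaks.ini file from a
  script, Python's standard configparser module reads this layout.
  The sketch below is an illustration only: the helper names
  load_settings and has_headroom are made up, and the 10% margin is
  simply the lower end of the range recommended above.

```python
import configparser

SAMPLE_INI = """\
[Integration]
OnSlopeValue=1.0
OffSlopeValue=-0.6
OnThresholdValue=1.0
OffThresholdValue=0.2
PtsMovAvgDer=20
MinPeakArea=10.0

[ArraySize]
RowsOfData=10000
MaxNumPeaks=100
"""


def load_settings(text):
    """Parse Peaks.ini-style text into a flat dict of numbers."""
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep the mixed-case key names as written
    parser.read_string(text)
    integ = parser["Integration"]
    sizes = parser["ArraySize"]
    return {
        "OnSlopeValue": integ.getfloat("OnSlopeValue"),
        "OffSlopeValue": integ.getfloat("OffSlopeValue"),
        "OnThresholdValue": integ.getfloat("OnThresholdValue"),
        "OffThresholdValue": integ.getfloat("OffThresholdValue"),
        "PtsMovAvgDer": integ.getint("PtsMovAvgDer"),
        "MinPeakArea": integ.getfloat("MinPeakArea"),
        "RowsOfData": sizes.getint("RowsOfData"),
        "MaxNumPeaks": sizes.getint("MaxNumPeaks"),
    }


def has_headroom(setting, actual, margin=0.10):
    """True if an [ArraySize] setting exceeds the actual count by the margin."""
    return setting >= actual * (1.0 + margin)
```

  For example, with RowsOfData=10000 a data file of 8000 rows has
  enough headroom, while a file of 9500 rows does not.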


Methods of running Peaks.exe:
=============================
 A.  Type "peaks" at the command line in the directory that has your
       file.
 B.  Type "peaks sample.pks" to start peaks with the specified file.
 C.  Associate some file extension (e.g. .pks) with peaks in the file
       manager, then double click on the file to run.
       (this is what we do).
 D.  Type "peaks sample.pks g" to start peaks with the specified file
       and run immediately without the user dialogs.
 E.  Type "peaks -h" to get a brief help message.
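
  Method D lends itself to batch processing.  A minimal Python
  sketch, assuming peaks is on the PATH and using only the
  "peaks <file> g" form described above, might look like this.  The
  helper names are made up, and since the exit status of Peaks.exe is
  not documented here, the sketch does not check it.

```python
import subprocess


def build_command(data_file, exe="peaks"):
    """Command line for method D: run immediately, no user dialogs."""
    return [exe, data_file, "g"]


def run_batch(data_files, exe="peaks"):
    """Run Peaks once per data file, one after another."""
    for path in data_files:
        subprocess.run(build_command(path, exe))
```

  For example, run_batch(["run1.pks", "run2.pks"]) would process two
  (hypothetical) data files in turn.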


Output file name:
=================
  Peaks initially adds the ".out" extension to the file name, but this
   can be changed in the user dialog.


Baseline:
=========
  The program reports the uncorrected area and also the area after
  baseline subtraction.  The baseline subtraction uses an average of
  the points just before and after the peak.
  The program also does a least squares fit of an exponential curve
  to the baseline points.  This is intended to be useful for rising or
  falling baselines.
  The program attempts to determine if a given peak is fused to
  another peak and makes some adjustments for this, using the least
  squares fitted curve to estimate the true baseline for each of the
  fused peaks.
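
  The idea of subtracting a fitted exponential baseline can be
  sketched in Python as follows.  This is not the method used inside
  Peaks.exe.  The sketch assumes a baseline of the form
  y = a * exp(b * x), fits it by ordinary least squares on ln(y)
  (which requires positive baseline values), and integrates with the
  trapezoid rule.  The exact curve form, fitting procedure, and
  integration rule used by Peaks are not stated in this file, and all
  function names are made up.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on ln(y); needs all y > 0."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx
    a = math.exp(mean_l - b * mean_x)
    return a, b


def trapezoid_area(xs, ys):
    """Area under the points (xs, ys) by the trapezoid rule."""
    return sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
               for i in range(len(xs) - 1))


def corrected_area(xs, ys, start, end, a, b):
    """Return (area minus baseline, uncorrected area, baseline area)
    for the peak spanning indices start..end inclusive."""
    px = xs[start:end + 1]
    py = ys[start:end + 1]
    raw = trapezoid_area(px, py)
    base = trapezoid_area(px, [a * math.exp(b * x) for x in px])
    return raw - base, raw, base
```

  The baseline points passed to fit_exponential would be the points
  lying outside the detected peaks; the three returned values
  correspond to the first three area columns in the output.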


Output values for each peak:
============================
  Peak number
  Peak time
  Peak Area minus Baseline area
  Peak Area uncorrected
  Baseline area
  Average Baseline value at start of peak
  Average Baseline value at end of peak
  Time of start of peak
  Time of end of peak
  Methods used to start/stop peak (slope or threshold)
     Output numbers are reported in tab delimited columns to
     facilitate subsequent import into spreadsheet programs.


Acknowledgements:
=================
  Special thanks to Mark Watson for suggestions and debugging.
  Thanks to Don Gilbert and Tim Cutts for answering C++ questions.
  The style of the opening user dialog was suggested by that
     of Felsenstein's Phylip programs.
  The research in the author's lab, the data from which is analyzed
  by this program, is supported by a grant from US DOE Energy
  BioSciences.





