My lab needed a program to integrate areas from data derived from our
HPLC, so I wrote a small program, called Peaks, to fill this need.
I am making the program available in case anyone else has a similar
need.
Peaks is compiled as a 32-bit command-line program for Microsoft
Windows 95 or NT.
Versions are available for Intel and DEC Alpha processors.
Peaks takes data in the form of two columns of numbers, the first
column typically being time (X axis) and the second being some
value (Y axis). The program then detects peaks, integrates, and
subtracts out the baseline, according to user-configurable parameters.
A major advantage of Peaks.exe is that it is free.
If it is useful to you, or you have comments, send me an e-mail.
Peaks version 0.5 may be downloaded by anonymous FTP from:
dogwood.botany.uga.edu
/pub/Malmberg
Peaks05.intel.zip is for an Intel processor.
Peaks05.alpha.zip is for a DEC Alpha processor.
The Peaks.txt file is appended below.
Russell L. Malmberg
russell at dogwood.botany.uga.edu
=========================================================
Peaks.txt file
---------------------------------------------------------
Peaks version 0.5 -- Russell L. Malmberg -- 26 April 1996
=========================================================
Russell L. Malmberg
Botany Department
University of Georgia
Athens, GA 30602-7271
e-mail: russell at dogwood.botany.uga.edu
This is a 32-bit command-line program for Windows 95 or Windows NT.
Peaks05.intel.zip is for an Intel processor.
Peaks05.alpha.zip is for a DEC Alpha processor.
It is my first serious programming attempt in C++, so it probably has
bugs. The program is free, and I hope it is useful to you, but use it
at your own risk.
Please send me an e-mail with comments/bugs if you use the program and
find it useful at all.
Peaks was written to analyze data from our laboratory HPLC. The data
consist of a time axis and a y-axis, and we wished to integrate the
areas under the various peaks.
What it does:
=============
1. Peak detection:
The program moves along the X-axis calculating the slope of the
curve and a moving average of the slope. The program detects the
start of a peak when the moving average of the slope rises above
some positive value (OnSlopeValue); it records the peak time as
the point where the moving average of the slope passes from
positive through zero to negative; it then detects the end of a
peak when the moving average of the slope, having gone negative,
rises back above some negative value (OffSlopeValue).
In addition to the slopes, the program can use absolute threshold
values (OnThresholdValue and OffThresholdValue) as triggers for the
start and stop of a peak. The threshold values are useful only if
the baseline does not rise or fall significantly.
The user can set the number of points in the moving average of the
slope (PtsMovAvgDer) to adjust for how much noise is in the data
set.
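The slope-based detection described above can be sketched roughly as
follows. This is an illustrative Python sketch, not the code used in
Peaks: the function name detect_peaks, the trailing moving-average
window, and the point-to-point slope estimate are all assumptions, and
the threshold triggers are left out.

```python
def detect_peaks(xs, ys, on_slope=1.0, off_slope=-0.6, pts_mov_avg=20):
    """Return (start, apex, end) index triples from a slope-based scan.

    Index i refers to the interval that starts at sample i.
    """
    # Point-to-point slope between neighbouring samples.
    slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i])
              for i in range(len(xs) - 1)]

    # Trailing moving average of the slope (window shape is an assumption).
    avg = []
    for i in range(len(slopes)):
        window = slopes[max(0, i - pts_mov_avg + 1):i + 1]
        avg.append(sum(window) / len(window))

    peaks = []
    state, start, apex, steep = "baseline", 0, 0, False
    for i, s in enumerate(avg):
        if state == "baseline":
            if s > on_slope:        # slope rises above OnSlopeValue
                state, start = "rising", i
        elif state == "rising":
            if s <= 0:              # slope passes through zero: peak time
                state, apex, steep = "falling", i, False
        if state == "falling":
            if s < off_slope:       # on the descending side of the peak
                steep = True
            elif steep:             # slope is back above OffSlopeValue
                peaks.append((start, apex, i))
                state = "baseline"
    return peaks


# Synthetic triangular peak: flat, up 2 per step, down 2 per step, flat.
xs = list(range(21))
ys = [0] * 5 + [0, 2, 4, 6, 8, 10, 8, 6, 4, 2, 0] + [0] * 5
print(detect_peaks(xs, ys, pts_mov_avg=1))  # [(5, 10, 15)]
```

With a longer averaging window the detected start, apex, and end lag
behind the raw data, which is the trade-off PtsMovAvgDer controls.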
2. Integration:
The program computes the area under the detected peaks, and
estimates the baseline area by comparison to points taken just
before and after the beginning and ending of the peak, and also by
comparison with a curve fitted to the baseline.
The program reports the area after baseline subtraction, as well as
the unadjusted area and the baseline area used.
You can prevent tiny peaks from being reported by adjusting the
minimum area to be reported (MinPeakArea).
The curve fitted to the baseline is an exponential curve, whose
parameters are reported in the output file. This is useful if you
have a rising or falling baseline.
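As a rough illustration of the three reported areas, the sketch below
integrates one peak and subtracts a baseline estimated from the points
just before and just after it. It is not the Peaks source: the
trapezoidal rule, the function name integrate_peak, and the n_base
parameter (how many neighbouring points to average) are assumptions.

```python
def integrate_peak(xs, ys, start, end, n_base=3):
    """Return (corrected_area, raw_area, baseline_area) for one peak."""
    # Trapezoidal area under the raw curve between the start and end indices.
    raw = sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2.0
              for i in range(start, end))

    # Average level of the points just before the start and after the end.
    # Fall back to the boundary point itself if no neighbours exist.
    before = ys[max(0, start - n_base):start] or [ys[start]]
    after = ys[end + 1:end + 1 + n_base] or [ys[end]]
    base_start = sum(before) / len(before)
    base_end = sum(after) / len(after)

    # Baseline area: a trapezoid joining the two averaged baseline levels.
    baseline = (xs[end] - xs[start]) * (base_start + base_end) / 2.0
    return raw - baseline, raw, baseline


# Triangular peak of height 10 sitting on a constant baseline of 1.
xs = list(range(21))
ys = [1] * 5 + [1, 3, 5, 7, 9, 11, 9, 7, 5, 3, 1] + [1] * 5
print(integrate_peak(xs, ys, start=5, end=15))  # (50.0, 60.0, 10.0)
```

A peak whose corrected area falls below MinPeakArea would simply be
left out of the report.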
Known Bugs:
===========
The program is not bullet-proof: if you give it a character when
it is expecting a number, or vice versa, it may crash. If this
seems to have happened, simply press CTRL-C to exit.
If you set the [ArraySize] parameters in the Peaks.ini file much
smaller than the actual size of the data set, the program
sometimes, but not always, crashes.
I don't know why this happens.
Missing Features:
=================
Peaks does not (yet) produce any graphical output.
It does not (yet) have an internal standard calibration method
for peaks of known, defined area.
Installation:
=============
0. Unzip.
1. Put Peaks.exe and Peaks.ini in a directory specified by the
PATH environment variable.
2. Set the [ArraySize] parameters in the peaks.ini file to be
larger than the number of rows of data in your files, and to be
larger than the maximum number of peaks you expect.
3. Do trial runs, changing the parameter values in the user dialog.
4. Modify peaks.ini to change [Integration] defaults to suit your
data. This will set the default values in the user dialog.
File Format:
============
Peaks computes the area under peaks on a curve from an input file in
this format:
-- Zero or more rows of header/comment (character) information.
followed by:
-- Two columns of data, the first column of which is X (time) and
the second column of which is the Y for the X.
Peaks treats any row that does not begin with a number as a
header/comment row.
Once the first character in a row is a number, Peaks expects that
row and everything after it to be the two columns of data.
A sample file "sample.pks" is included.
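For anyone preparing or checking input files, the format above can be
read with a few lines of Python. This is only a sketch of the rules as
stated, not the parser in Peaks: the function name read_peaks_file is
hypothetical, and it assumes whitespace-separated columns, that "a
number" means a leading digit, and that blank lines can be skipped.

```python
def read_peaks_file(lines):
    """Split an iterable of text lines into X and Y value lists."""
    xs, ys = [], []
    in_data = False
    for line in lines:
        text = line.strip()
        if not text:
            continue                    # skip blank lines (assumption)
        if not in_data:
            if not text[0].isdigit():
                continue                # header/comment row
            in_data = True              # first row that starts with a digit
        # From here on every row must hold the two data columns.
        x, y = text.split()[:2]
        xs.append(float(x))
        ys.append(float(y))
    return xs, ys


sample = [
    "Sample run, channel A",
    "time  signal",
    "0.0   1.5",
    "0.1\t2.5",
]
print(read_peaks_file(sample))  # ([0.0, 0.1], [1.5, 2.5])
```

Note that a comment row beginning with a digit would be taken as the
start of the data, which matches the rule described above.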
Peaks.ini file:
===============
There is a peaks.ini file that sets the default values for the
program. The default values for integration parameters can be
modified at run time in the user dialog.
The format of the Peaks.ini file is similar to any standard
Windows .ini file, with entries of the form name=number.
[Integration]
OnSlopeValue=1.0
OffSlopeValue=-0.6
OnThresholdValue=1.0
OffThresholdValue=0.2
PtsMovAvgDer=20
MinPeakArea=10.0
[ArraySize]
RowsOfData=10000
MaxNumPeaks=100
The [Integration] parameters are used in the user dialog as the
default values. Change these parameters in the .ini file when you
have a set of parameters that integrates your data in a reasonable
way. The [ArraySize] parameters should be 10% to 50% larger than
the actual numbers you have or expect from your data.
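If you want to check a Peaks.ini file before a run, Python's standard
configparser module reads this layout. The sketch below is not part of
Peaks; the helper names load_settings and has_headroom are
hypothetical, and the headroom check encodes only the lower end (10%)
of the guideline above.

```python
import configparser

INI_TEXT = """\
[Integration]
OnSlopeValue=1.0
OffSlopeValue=-0.6
OnThresholdValue=1.0
OffThresholdValue=0.2
PtsMovAvgDer=20
MinPeakArea=10.0
[ArraySize]
RowsOfData=10000
MaxNumPeaks=100
"""


def load_settings(text):
    """Parse Peaks.ini-style text into a flat dict of numbers."""
    cfg = configparser.ConfigParser()
    cfg.read_string(text)
    settings = {}
    for name in ("OnSlopeValue", "OffSlopeValue", "OnThresholdValue",
                 "OffThresholdValue", "MinPeakArea"):
        settings[name] = cfg.getfloat("Integration", name)
    settings["PtsMovAvgDer"] = cfg.getint("Integration", "PtsMovAvgDer")
    for name in ("RowsOfData", "MaxNumPeaks"):
        settings[name] = cfg.getint("ArraySize", name)
    return settings


def has_headroom(configured, actual):
    """True if the configured size is at least 10% above the actual count."""
    return configured * 10 >= actual * 11


settings = load_settings(INI_TEXT)
print(settings["PtsMovAvgDer"], settings["RowsOfData"])  # 20 10000
print(has_headroom(settings["RowsOfData"], 9000))        # True
```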
Methods of running Peaks.exe:
=============================
A. Type "peaks" at the command line in the directory that has your
file.
B. Type "peaks sample.pks" to start peaks with the specified file.
C. Associate some file extension (e.g. .pks) with peaks in the file
manager, then double click on the file to run.
(this is what we do).
D. Type "peaks sample.pks g" to start peaks with the specified file
and run immediately without the user dialogs.
E. Type "peaks -h" to get a brief help message.
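Method D lends itself to processing several files in a row. The sketch
below builds the command lines listed above and runs them one after
another. It assumes peaks is on the PATH; the function names are
hypothetical, and the meaning of the program's exit code is not
documented here, so the return codes are collected without being
interpreted.

```python
import subprocess


def peaks_command(data_file, run_immediately=True):
    """Build the argument list for one Peaks run (methods B and D)."""
    cmd = ["peaks", data_file]
    if run_immediately:
        cmd.append("g")     # skip the user dialogs, as in method D
    return cmd


def run_batch(data_files):
    """Run Peaks on each file in turn and return the raw exit codes."""
    return [subprocess.run(peaks_command(name)).returncode
            for name in data_files]


print(peaks_command("sample.pks"))  # ['peaks', 'sample.pks', 'g']
```

Calling run_batch(["run1.pks", "run2.pks"]) would then integrate both
files using the defaults from Peaks.ini.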
Output file name:
=================
Peaks initially adds the ".out" extension to the file name, but this
can be changed in the user dialog.
Baseline:
=========
The program reports the uncorrected area and also performs a
baseline subtraction. The baseline subtraction uses an average of
the points just before and after the peak.
The program also does a least squares fit of an exponential curve
to the baseline points. This is intended to be useful for rising or
falling baselines.
The program attempts to determine if a given peak is fused to
another peak and makes some adjustments for this, using the least
squares fitted curve to estimate the true baseline for each of the
fused peaks.
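One common way to do a least squares fit of an exponential is to fit a
straight line to the logarithm of the Y values. The sketch below shows
that approach for a curve of the form y = a * exp(b * x). Whether Peaks
uses this parameterization or this log-linear method is not stated
here, so treat both as assumptions; the function names are
hypothetical, and the method requires all baseline Y values to be
positive.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on ln(y); returns (a, b)."""
    n = len(xs)
    logs = [math.log(y) for y in ys]        # requires every y > 0
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx                           # slope of ln(y) against x
    a = math.exp(mean_l - b * mean_x)       # intercept, back-transformed
    return a, b


def baseline_area(a, b, x0, x1):
    """Area under a * exp(b * x) between x0 and x1."""
    if b == 0:
        return a * (x1 - x0)                # flat baseline
    return a / b * (math.exp(b * x1) - math.exp(b * x0))


# Baseline points generated from a = 2.0, b = 0.3 are recovered by the fit.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * math.exp(0.3 * x) for x in xs]
a, b = fit_exponential(xs, ys)
print(round(a, 6), round(b, 6))  # 2.0 0.3
```

The fitted curve can then be integrated over each fused peak's time
range with baseline_area to estimate the baseline under that peak.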
Output values for each peak:
============================
Peak number
Peak time
Peak Area minus Baseline area
Peak Area uncorrected
Baseline area
Average Baseline value at start of peak
Average Baseline value at end of peak
Time of start of peak
Time of end of peak
Methods used to start/stop peak (slope or threshold)
Output numbers are reported in tab-delimited columns to
facilitate subsequent import into spreadsheet programs.
Acknowledgements:
=================
Special thanks to Mark Watson for suggestions and debugging.
Thanks to Don Gilbert and Tim Cutts for answering C++ questions.
The style of the opening user dialog was suggested by that
of Felsenstein's Phylip programs.
The research in the author's lab, the data from which is analyzed
by this program, is supported by a grant from US DOE Energy
BioSciences.