We are pleased to announce the availability of GenFrag 2.2.1, a set of
utilities to generate artificial DNA data sets for the purpose of testing
software developed to support large-scale sequencing projects. The goal
of this code is to provide a robust source of reproducible DNA fragment
sets with systematically and independently varied characteristics.

GenFrag takes an input sequence, optionally inserts user-specified
repeat templates, and then fragments this sequence according to
user-specified criteria. The resulting fragments can be used directly
in a downstream application (e.g., a sequence assembly program), or
first mutated using a flat error rate or an error profile. Finally, a
figure displaying the layout of the fragments relative to the input
sequence can be generated in X11 or PostScript format. A number of log
files are created that contain the parameters used to generate the data
as well as the information needed to produce the correct arrangement of
the fragment set. Some of the characteristics that can be manipulated
are repeat complexity, fragment length, mean depth of coverage on the
parent, and error distribution.
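
To illustrate the idea of a reproducible fragment set, the following is a
minimal Python sketch, not GenFrag's own code or interface. It assumes
fragments are drawn uniformly at random from the parent with lengths in a
fixed range; the function names, parameters, and the random parent sequence
are all hypothetical. Mean depth of coverage is computed as the total
number of fragment bases divided by the parent length.

```python
import random


def fragment(parent, n_fragments, min_len, max_len, seed=0):
    """Sample fragments uniformly from `parent`.

    Returns a list of (start, sequence) pairs. The start offsets play the
    role of the layout information needed to reconstruct the arrangement.
    """
    max_len = min(max_len, len(parent))
    if min_len < 1 or min_len > max_len:
        raise ValueError("need 1 <= min_len <= max_len <= len(parent)")
    rng = random.Random(seed)  # a fixed seed makes the set reproducible
    fragments = []
    for _ in range(n_fragments):
        length = rng.randint(min_len, max_len)
        start = rng.randint(0, len(parent) - length)
        fragments.append((start, parent[start:start + length]))
    return fragments


def mean_coverage(fragments, parent_length):
    """Mean depth of coverage: total fragment bases / parent length."""
    return sum(len(seq) for _, seq in fragments) / parent_length


# Hypothetical 5000 bp parent sequence, generated only for this example.
parent_rng = random.Random(42)
parent = "".join(parent_rng.choice("ACGT") for _ in range(5000))

frags = fragment(parent, n_fragments=50, min_len=300, max_len=500, seed=1)
print(f"{len(frags)} fragments, "
      f"mean coverage {mean_coverage(frags, len(parent)):.2f}x")
```

Rerunning with the same seed yields the same fragments, so fragment length
and coverage can be varied one at a time while everything else is held fixed.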

GenFrag 2.2.1 is functionally identical to beta release 2.2 with the
following exception. If a fragment encountered by mutate exceeds 600 bp,
the length of the default error profile, an error is reported and the
program exits. The README file provided with the software describes some
modifications that can be made to allow one to model errors for longer
fragments.

One problem noticed by several users was that the sample error
distribution file had the wrong number of columns; this oversight has
been corrected.
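
The 600 bp limit described above follows from a position-dependent error
profile: a fragment longer than the profile has positions with no error
rate defined. The Python sketch below illustrates that check under stated
assumptions. It is not the mutate program itself; the profile layout (one
substitution probability per position) and the linear ramp are hypothetical,
and only substitutions are modeled for brevity.

```python
import random

PROFILE_LENGTH = 600  # length of the default error profile, per the text


def mutate(fragment, profile, seed=0):
    """Apply substitutions using a per-position error probability.

    `profile[i]` is the assumed probability of a substitution at position i.
    A fragment longer than the profile is rejected, since no error rate is
    defined for its trailing positions.
    """
    if len(fragment) > len(profile):
        raise ValueError(
            f"fragment length {len(fragment)} exceeds "
            f"profile length {len(profile)}"
        )
    rng = random.Random(seed)
    out = []
    for i, base in enumerate(fragment):
        if rng.random() < profile[i]:
            # Replace the base with one of the three other bases.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)


# Hypothetical profile: error probability rises linearly from 1% to 10%.
profile = [0.01 + 0.09 * i / (PROFILE_LENGTH - 1)
           for i in range(PROFILE_LENGTH)]

read = "ACGT" * 100  # 400 bp, within the profile length
mutated = mutate(read, profile, seed=7)
print(sum(a != b for a, b in zip(read, mutated)), "substitutions")
```

A flat error rate is the special case in which every entry of the profile
holds the same value.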

Engle, M.L. and Burks, C. (1994). "GenFrag 2.1: New Features for More
Robust Fragment Assembly Benchmarks." CABIOS 10:567-568.
Engle, M.L. and Burks, C. (1993). "Artificially Generated Data Sets
for Testing DNA Fragment Assembly Algorithms." Genomics 16:286-288.

GenFrag can be retrieved in several ways:
1.) Electronic mail: send an e-mail message to "bioserve at t10.lanl.gov"
that contains the single word "genfrag" in the message text.
2.) Anonymous ftp: ftp to "t10.lanl.gov", use login name "anonymous",
send your email address at the password prompt, cd to "/pub/genfrag",
type "binary" and then "get genfrag-2.2.1.tar.Z".
3.) WWW: open URL "ftp://t10.lanl.gov/pub/genfrag/genfrag-2.2.1.tar.Z".

Michael L. Engle
Los Alamos National Laboratory    email: mle at t10.lanl.gov
MS K710, Group T-10               phone: (505) 665-2598
Los Alamos, NM 87545              FAX:   (505) 665-3493