Serial Fragment Order from Fragment Overlap Data

MARK at cuccfa.bitnet MARK at cuccfa.bitnet
Thu Jan 17 15:53:00 EST 1991

Greetings from a non-subscriber!

        Does anybody out there know of available, working software
which can calculate serial fragment order from fragment pairwise
overlap data? The overlap data resulting from a Columbia University
scientist's new experimental techniques is purely binary in nature,
indicating whether any two fragments overlap or do not overlap. NO
sequence or length information is available for our fragments.

        I am aware of two papers from the past which addressed
precisely this problem:

        Waterman and Griggs (1986)
        Interval graphs and maps of DNA
        Bull Math Biol 48(2): 189-195

        Jungck, Dick, and Dick (1982)
        Computer-assisted sequencing, interval graphs, and molecular
        BioSystems 15: 259-273

The Cambridge statistician D.G. Kendall addressed similar problems
twenty years ago for purposes of seriation in archaeology.

        Thanks in advance for any responses! Please reply to me
directly (address at the end of this message).

-- Mark Reboul

P.S. -- If you care to read on, here are a few amplifications....

        From a novel experimental technique recently developed here
at Columbia University, we will get something like a total count of
fragments and a list of which pairs of them overlap. There will be
nothing fractional: Either a pair of fragments will overlap or it
won't. For now, we assume we will detect fragment overlaps without
error. The experiment will reveal nothing about the fragment
lengths, and thus we have no hope of determining serial order with
any measure of length (i.e., scale) built into it.

        Actually, the new lab technique involves two separate
libraries of clones (the clones are the sequence "fragments"), and
it is only possible to detect overlaps between clones from opposite
libraries. The two libraries span essentially the same stretch of
chromosome, so I think as long as we are careful to label the clones
distinctly, the two-library aspect of the data needs no other
special treatment.

        Another issue is whether the pattern of fragment overlaps
covers the entire stretch of DNA in a totally connected way. If
necessary, we will break down the overlap data into multiple
clusters of data, each describing a connected series of overlaps. We
would then run the ordering algorithm separately on each
self-connected cluster of overlap data.

        So that's our computational problem: to determine a
reasonable serial order of fragments from a knowledge of their


T. Mark Reboul, Senior User Services Consultant
Columbia University Comprehensive Cancer Center Computing Facility
College of Physicians and Surgeons, Room 1-420
630 West 168th Street
New York, NY 10032

Bitnet          mark at cuccfa
Internet        mark at cuccfa.ccc.columbia.edu    []

(212) 305-7360

More information about the Bio-soft mailing list

Send comments to us at biosci-help [At] net.bio.net