The purpose of this note is to announce the availability of FASTLINK 2.3P.
FASTLINK is a faster version of the principal linkage analysis programs
in LINKAGE 5.1.
Thanks to Lucien Bachner, Carolyn Bucholtz, John Powell, Gyorgy Simon,
Jim Tomlin, and Garret Taylor for assistance with beta testing and
portability testing of the parallel code.
Thanks to Margaret Gelder Ehm, Carol Haynes, Patricia Kramer,
Toby Nygaard, Marcy Speer, Gerard Tromp, and Frank Visser for
bug reports and suggestions that helped in developing version 2.3P.
Thanks to Anita Destefano and Kimmo Kallio for assistance with portability
to VMS.
As with previous versions, FASTLINK 2.3P can be ftp-ed from
softlib.cs.rice.edu in the directory
pub/fastlink
The main advance over FASTLINK 2.2 is that much of the code can now
run in parallel, which explains the P in the new version number. Most of this
message will focus on the parallel code, but let me put some remarks
about the sequential code first, so that those who want only sequential
code can skip the rest.
|*| Sequential Code
---------------
Version 2.3P has some improvements in the sequential code
and documentation, which are covered at the end of README.updates.
New features include speeding up runs involving multiple LINKMAP scripts
that move a marker across a fixed map.
The organization of the code files and Makefile has changed substantially, so
that the same files can be used to make both sequential and parallel
executable files.
There is a new auxiliary program called ofm ("optimize for maxhap") to
assist with automatic recompilation of the programs.
|*| Parallel Code, Introduction
---------------------------
The (sequential) FASTLINK package already provides considerable running
time improvements over the older programs for the LINKAGE package.
Response from users about sequential FASTLINK has been extremely
enthusiastic, yet it is abundantly clear that more speedup is necessary.
At this time, we believe that one realistic way to obtain substantially
more speedup on long runs is to use multiple processors in parallel.
We continue to investigate further sequential speedups.
Two attempts to parallelize ILINK from FASTLINK are described in the papers:
3. Sandhya Dwarkadas, Alejandro A. Schaffer, Robert W. Cottingham Jr.,
Alan L. Cox, Peter Keleher, and Willy Zwaenepoel, Parallelization of
General Linkage Analysis Problems, Human Heredity 44(1994),
pp. 127-141.
4. Sandeep K. Gupta, Alejandro A. Schaffer, Alan L. Cox, Sandhya
Dwarkadas, and Willy Zwaenepoel, Integrating Parallelization
Strategies for Linkage Analysis, Computers and Biomedical Research
28(1995), pp. 116-139.
These two papers are available as paper3.ps and paper4.ps with the
distribution. The version of parallel ILINK that we are distributing
is similar algorithmically to that described in the second paper. We
were able to achieve speedups in the 5 to 7 range on a network of 8
DECStation5000/Ultrix processors on ILINK runs that take tens of
minutes sequentially.
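
To put the speedup figures above in perspective, here is a minimal
sketch of the arithmetic relating speedup, processor count, and running
time. The function names and the 40-minute sequential run are
hypothetical, chosen only for illustration; they are not measurements
from the papers or part of FASTLINK.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal (linear) speedup actually achieved."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


def parallel_minutes(sequential_minutes, speedup):
    """Running time in parallel, given the sequential time and the speedup."""
    if speedup <= 0:
        raise ValueError("speedup must be positive")
    return sequential_minutes / speedup


if __name__ == "__main__":
    processors = 8             # network size quoted in the text
    sequential_minutes = 40.0  # hypothetical run, for illustration only
    for speedup in (5.0, 7.0):  # range quoted in the text
        eff = parallel_efficiency(speedup, processors)
        mins = parallel_minutes(sequential_minutes, speedup)
        print(f"speedup {speedup:.0f}: efficiency {eff:.3f}, "
              f"about {mins:.1f} minutes in parallel")
```

Under these assumptions, a speedup of 5 to 7 on 8 processors corresponds
to a parallel efficiency of roughly 63% to 88%.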
|*| Parallel FASTLINK, Operation
----------------------------
FASTLINK 2.3P can be run in parallel on two different types of platforms:
shared-memory multiprocessors and networks of UNIX workstations.
FASTLINK on shared-memory multiprocessors:
The shared-memory version uses the p4 macros, which are available by
anonymous ftp from Argonne National Labs. More detailed retrieval and
installation instructions can be found in README.p4 that comes with
FASTLINK.
FASTLINK on networks of workstations:
If you have access to a network of (uniprocessor) UNIX workstations,
then you can run parallel FASTLINK using a runtime package called
TreadMarks. TreadMarks essentially provides the same execution
environment on a network of workstations as that available on
shared-memory multiprocessors.
TreadMarks is available for a small fee for universities and medical
schools, and at commercial rates for other institutions. All users
can get a 30-day free trial license. See README.TreadMarks for
more details on how to configure FASTLINK with TreadMarks.
TreadMarks licenses can be obtained by sending e-mail to
treadmarks at ece.rice.edu. Please specify the nature of your organization
(commercial or university/medical school) and the machine architecture
and operating system you plan to use TreadMarks for. Once you return the
signed license and the license fee, a copy of TreadMarks will be sent
to you or be made available via ftp. A free 30-day demo copy can also
be obtained by sending e-mail to the same address.
We recognize that installing the parallel code is more
difficult than installing the sequential code because of the need to
configure both the FASTLINK code and the runtime library (either p4 or
TreadMarks). We will be pleased to work with you in getting the parallel
code up and running on your system.
|*| Parallel FASTLINK, References
-----------------------------
The main references for sequential FASTLINK are:
1. R. W. Cottingham Jr., R. M. Idury, and A. A. Schaffer, Faster Sequential
Genetic Linkage Computations, American Journal of Human Genetics, 53(1993),
pp. 252-263.
2. A. A. Schaffer, S. K. Gupta, K. Shriram, and R. W. Cottingham, Jr.,
Avoiding Recomputation in Genetic Linkage Analysis, Human Heredity,
44(1994), pp. 225-237.
The main references for LINKAGE are:
5. G. M. Lathrop, J.-M. Lalouel, C. Julier, and J. Ott, Strategies for
Multilocus Analysis in Humans, PNAS 81(1984), pp. 3443-3446.
6. G. M. Lathrop and J.-M. Lalouel, Easy Calculations of LOD Scores
and Genetic Risks on Small Computers, American Journal of Human Genetics,
36(1984), pp. 460-465.
7. G. M. Lathrop, J.-M. Lalouel, and R. L. White, Construction of Human
Genetic Linkage Maps: Likelihood Calculations for Multilocus Analysis,
Genetic Epidemiology 3(1986), pp. 39-52.
One reference for p4 is:
8. R. Butler and E. Lusk, Monitors, Messages and Clusters: The p4
Parallel Programming System, Parallel Computing 20(1994), pp. 547-564.
One reference for TreadMarks is:
9. P. Keleher, A. L. Cox, S. Dwarkadas, and W. Zwaenepoel,
TreadMarks: Distributed Shared Memory on Standard Workstations
and Operating Systems, Proceedings of the Winter 94 Usenix Conference,
pp. 115-131, January 1994.
FASTLINK 2.3P represents the conjunction of 5 substantial research
efforts and software engineering projects. Therefore, if you use
FASTLINK in parallel, we ask that you cite:
at least one of 5,6,7 to give credit to LINKAGE
at least one of 1,2 to give credit to sequential FASTLINK
at least one of 3,4 to give credit for the parallel algorithms and
either 8 (p4) or 9 (TreadMarks) to give credit for the runtime library
that you use.
Sincerely,
Chris Hyams and Alejandro Schaffer and Alan Cox
and Sandhya Dwarkadas and Willy Zwaenepoel
Rice University
cgh at cs.rice.edu
schaffer at cs.rice.edu
alc at cs.rice.edu
sandhya at cs.rice.edu
willy at cs.rice.edu