BIONET.GENE-LINKAGE
FREQUENTLY-ASKED-QUESTIONS
By Darrell Root
rootd at ee.pdx.edu
FAQ admin information
Where can I obtain the bionet.molbio.gene-linkage FAQ?
Who created the bionet.molbio.gene-linkage FAQ?
What other people contributed to this FAQ?
How can I help improve this FAQ?
What kind of information will never be contained in this FAQ?
Information Resources
What anonymous-ftp sites have programs/utilities useful for genetic
linkage analysis?
I think I know the name of a program I want, but I don't know where I
can find it
I have an ftp site with gene-linkage programs/utilities on it. How do I
get registered with the archie servers?
What gopher sites have useful genetic-linkage information?
What books are helpful when learning about genetic linkage analysis?
What genetic-linkage databases are available on the internet?
What is WWW?
What is Mosaic?
What is lynx?
I can telnet to the internet. Can I access the web?
What www sites have useful genetic-linkage information?
What "linkage centers" make information and assistance available to
researchers?
What journals are useful for genetic-linkage analysis?
Gene-linkage software overview
What database management programs do people use for
genetic-linkage data?
What programs are available for pedigree drawing?
Why are some programs used primarily for chromosome mapping,
while others are used for disease-mapping?
What programs are used for chromosome mapping?
What programs are used for disease-gene mapping?
What programs are available to help detect errors in linkage data?
What is Cyrillic?
Programs to assist in the recoding of genetic markers
Linkage package specific information
How do you calculate MAXHAP?
When should you use binary coding instead of numeric allele coding?
What is the effect of having allele frequencies not add up to 1, eg.
when some alleles are not present in a pedigree under study?
I use LINKAGE and/or FASTLINK. What references should I include
in my papers?
A discussion of recoding alleles in linkage analysis
Computer administration and optimization
How can I increase the speed of the linkage/fastlink package on my
workstation?
I set up 300 megs of paging space on my workstation, but now I'm
running out of hard-drive. Is there any way I can use my hard drive
space more efficiently?
But I don't know how to do all this optimization, and my research
assistant is spending all his/her time trying to figure it out.
How can I identify how much paging space is available on my
workstation?
File format specification and conversion
How do I convert between crimap and linkage formats?
How do I get my ceph data into crimap format?
Educational resources for teaching genetics
Genetics construction kit--fly genetics simulator
FAQ ADMINISTRATIVE INFORMATION
Where can I obtain the bionet.gene-linkage FAQ? [rootd;29may94]
It is available by anonymous-ftp from: ftp.ee.pdx.edu in
/pub/users/cat/rootd
The best way to view the faq is via the www, from
http://www.ee.pdx.edu/~rootd/gene-linkage.html
I also send the FAQ to news.answers, and to Dave Kristofferson, so it
should be included in the "standard" FAQ archives. Of course, I won't
be able to test that till after this goes out :-(
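The anonymous-ftp retrieval described above can be sketched in Python with the standard ftplib module. The host and directory are the ones given in this answer; the FAQ's file name is not stated here, so the sketch only lists the directory. The function name list_faq_directory and its ftp_factory parameter are hypothetical, and the site may no longer be reachable.

```python
from ftplib import FTP

FAQ_HOST = "ftp.ee.pdx.edu"        # host named in the answer above
FAQ_DIR = "/pub/users/cat/rootd"   # directory named in the answer above


def list_faq_directory(ftp_factory=FTP, host=FAQ_HOST, directory=FAQ_DIR):
    """Log in anonymously and return the file names in the FAQ directory.

    ftp_factory is a hypothetical hook so the function can be tried
    without a network connection; by default it is ftplib.FTP.
    """
    ftp = ftp_factory(host)        # ftplib.FTP(host) connects immediately
    try:
        ftp.login()                # no arguments means an anonymous login
        ftp.cwd(directory)
        return ftp.nlst()          # plain list of names in the directory
    finally:
        ftp.quit()
```

Calling list_faq_directory() with no arguments attempts a real connection; once the file name is known, ftplib's retrbinary can fetch it.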
Who created the bionet.molbio.gene-linkage FAQ? [rootd;19nov94]
I am Darrell Root, and I'm editing this in my own time. Unfortunately,
I don't have all that much free time, so this FAQ is sorta haphazard and
has some obvious holes (for example, some of the "software packages
for linkage analysis" answers point out ftp sites which are not included
in the "ftp site list"). In addition, I haven't double-checked much of the
information which I received from people (and I may have made a typo
or two), so if something appears incorrect, you're probably right.
Many thanks to everyone who sent me tons of information after the
FAQ revision 1. Unfortunately, that's when things started to get "busy"
and I'm just now doing the update (SIX MONTHS LATER). In
addition, I moved the faq www site from
http://www.ee.pdx.edu/rootd/gene-linkage.html to
http://www.ee.pdx.edu/~rootd/gene-linkage.html Sorry about that.
Tim Trautmann (timt at ee.pdx.edu) adapted the FAQ for www/Mosaic
use (before I learned html). He's responsible for all the wonderful
hypertext/ftp links. Great work Tim! (I'm afraid my hurried edits to
get this revision out have not been perfect, and the FAQ's formatting is
a little messed up--this is entirely my fault due to my haste: timt's
formatting was perfect...)
This FAQ is not perfect; in fact, it's not even pretty. During my 18
months doing linkage analysis work, I searched the net trying to find
stuff, and used up a bunch of time. This FAQ is sufficiently
disorganized that it may take you half-a-day to sort through it, but I
hope that will save you some time.
On a personal note, I'm continuing my career as a system
administrator, and am no longer doing genetic linkage analysis. If I
have time, I'll incorporate corrections/additions that people email me
(rootd at ee.pdx.edu), but I'm not actively searching/editing the faq. In
addition, someone who is doing linkage analysis would almost
certainly do a better job (assuming they have the time :-). For this
reason, I'm placing this FAQ in the public domain so anyone who
wants to take over editing it can do so without restriction. If you have
the time, and want to be a FAQ maintainer, send me some email.
My eternal thanks to those who sent me information. My repeated
apologies for not updating the FAQ for six months.
What other people contributed to this FAQ? [rootd;21may94]
Matthias Wjst sent us tons of useful material
David Kikuchi pointed out the genbank gopher sites
Pierre Janssens forwarded me some usenet answers, and
described Cyrillic.
Bennett Dyke provided information on his version of peddraw.
Michael Boehnke supplied a postal address to obtain simlink.
Don Bowden gave us a lead in finding a .gen->linkage
converter.
Young B Choi posted a list of journals to the net.
Robert Stodola sent us info about the chlc.
Ellen Wijsman gave a nice answer to the allele frequency
question on the net.
Jurg Ott sent us tons of corrections, clarifications, and new
information.
Peter Doris helped identify a problem with our ftp site.
David Adler told us about the idiograms at the University of
Washington.
Tim Littlejohn posted a gopher site with conference schedules.
John Attwood put his ceph2cri program on the net.
Rob Harper posted how people can use telnet to access the
WWW.
David Featherstone sent info about fastlink on SGI's, and on
CRI-MAP
Dave Curtis posted about DOLINK, the automatic recoder.
Kim Worley posted a web site.
Mike Miller sent me some info on LABMAN and LINKMAN.
Tara Matise sent me 18 separate pieces of information! Thanks!
Eli Meir posted about the Genetics Construction Kit (fly
genetics simulator).
I'm afraid some other people sent me stuff, some of which was
included, and some of which was lost (been a hectic half-year). My
apologies. Feel free to send me some nasty email (or a correction, or to
claim credit for something).
How can I help improve this FAQ? [rootd;19may94]
Think back to the old times. What do you understand now, that you
didn't understand then? What lack of knowledge caused you to waste
the most time? What information would have helped you become
productive more quickly? Share your hard-earned lessons with others!
There are a couple areas where I'd like to specifically request
assistance:
1. Internet resources: there are tons of ftp/gopher/www sites out
there. Nobody knows them all. Help me compile a complete
list. Send me the site addresses and a brief description of what's
there.
2. File format conversion programs: I want programs to convert
between the different file formats (crimap's .gen, ped.out,
linkage, simlink, peddraw (mac), liped, etc.). I'd like to compile
a "complete-set" of file conversion programs. I particularly
want source for Santosh Mishra's mkcrigen (ped.out -> .gen)
program.
3. An ftp site for crimap, simlink, mkcrigen, and the crimap
utilities package
4. Programs for manipulation, analysis, and comparison of .gen
files
5. I'd like plenty of "linkage-101" and "crimap-101" questions.
What did you waste most of your time on?
6. If somebody wants to formally specify some of the file formats,
and give a small example (or two) for each, I'd appreciate it.
What type of information will never be contained in this FAQ?
[rootd;21may94]
Conference schedules/information (too volatile for a FAQ, let the
journals handle it...but there's a nice gopher site in our gopher section
:-)
I sent you some information, and you either: didn't include it, or didn't
give me credit. What can I do? [rootd;29may94]
Oops. My mistake. I tried to keep a list of everyone and their
contribution, but didn't completely succeed (translation: I failed). My
apologies. Send me email and I will make appropriate corrections...
INFORMATION RESOURCES
What anonymous-ftp sites have programs/utilities useful for genetic
linkage analysis? [rootd;29may94]
corona.med.utah.edu keeps a UNIX version of LINKAGE
(Lathrop/Lalouel/Julier/Ott). They also keep the PC
version, but it doesn't appear to have been updated since
July-1991.
york.ccc.columbia.edu has the PC and VMS versions of
LINKAGE, and also other programs such as HOMOG, LIPED,
SLINK, and some programs from Dr. Newton Morton (LDB,
MAP-LODS, POINTER). In addition, all Linkage Newsletters
are kept online.
softlib.cs.rice.edu has FASTLINK, the optimized C versions of
linkage(5.1) which continue to undergo massive improvements.
ftp.gdb.org has some stuff in
/non-gdb-data/NIH-CEPH-data/CEPH-DATA/src, including
possibly the CRI-MAP utility programs (by Todd Steinbrueck
of Helen Donis-Keller's lab).
genome1.hgen.pitt.edu has Multimap, a lisp-based expert
system for automated construction of genetic linkage maps
using the CRI-MAP program.
watson.hgen.pitt.edu has some stuff from Dan Weeks,
including his APM programs and SLINK. Here's the info he
sent me:
At watson.hgen.pitt.edu, you'll find the following
files in the pub directory after logging in via
anonymous ftp:
newapm.tar.Z contains the package of programs
for the Affected Pedigree Member (APM) Method of
Linkage Analysis.
slink.tar.Z contains the SLINK package of
programs for simulation of genetic data.
cintmax.tar.Z contains a modified version
of CILINK which permits the usage of
different map functions in computing the
likelihood.
simapm.tar.Z contains the SLINK-based
simulation program for the APM package.
This represents a hacked together package
which only runs under a Unix system. You
will need FORTRAN, Pascal, and C compilers
to use this package.
ftp.bchs.uh.edu has some useful IBM programs, including:
peddraw (a DOS pedigree drawing
program--completely different from the B. Dyke
Macintosh peddraw 4.x)
fastmap produces a quick approximation to multipoint
lod scores
dolink A DOS genetic database/analysis-setup program
easistat A simple DOS statistics package
easigraf Draws graphs of lod scores
ftp.gene.ucl.ac.uk has the above IBM programs, as well as the
ceph2cri program from John Attwood. ceph2cri reads your
ped.out file and creates a crimap .gen file for you.
ftp.chlc.org is the Cooperative Human Linkage Center's ftp
site.
prep.ai.mit.edu is the home of GNU (the free software
foundation) which produces free software (such as the gcc
compiler, and the emacs editor).
wuarchive.wustl.edu is the largest anonymous ftp-site on the
planet. They have the whole GNU/free software foundation
distribution, and tons of other stuff.
ftp.gdb.org has all the files for OMIM (online mendelian
inheritance in man) and GDB (genome-data-base). Searching
within the search program is much easier.
ftp.ncsa.uiuc.edu has telnet, gopher, and mosaic clients for
many different types of computers. Ever wonder where "ncsa
telnet" was from? This is it.
ftp.ee.pdx.edu in /pub/users/cat/rootd is where I put the latest
FAQ version, my linkage->peddraw sed/awk script, and any
other stuff that program authors decide to let me put on my ftp
site.
NOTE: crimap and simlink are not currently available from
anonymous ftp sites.
There are many more sites with useful stuff. Email information to
rootd at ohsu.edu and I will add them to this list.
I think I know the name of a program I want, but I don't know where I
can find it. [rootd;21may94]
There is a database program called archie, which maintains a list of all
files in registered anonymous-ftp sites. You can telnet to an archie
server, and have it search the database. Each site is updated every 30
days, so very recently posted programs might not be listed yet.
To use archie, you need to telnet to one of the archie server sites, which
are:
archie.rutgers.edu
archie.sura.net
archie.unl.edu
archie.ans.net
archie.mcgill.ca
(thanks to O'Reilly's Internet book for this list)
Use the login name "archie" and nothing as your password. Here is a
simple archie login and search:
bigbox% telnet archie.unl.edu
login: archie
password: <--just hit return, not like anonymous-ftp
unl-archie> find linkmap
# Search type: sub.
# Your queue position: 2
# Estimated time for completion: 00:24
working... -
Host gatekeeper.dec.com (16.1.0.2)
Last updated 21:04 9 Apr 1994
Location: /contrib/src/pa/m3-2.07/src/driver/boot-DS3100
FILE -rw-r--r-- 4000 bytes 23:00 2 Jun 1992 M3LinkMap_i.c
FILE -rw-r--r-- 14027 bytes 23:00 2 Jun 1992 M3LinkMap_m.c
Location: /contrib/src/pa/m3-2.07/src/driver/linker/src
FILE -rw-r--r-- 1307 bytes 00:00 4 Dec 1991 M3LinkMap.i3
FILE -rw-r--r-- 3078 bytes 00:00 4 Dec 1991 M3LinkMap.m3
unl-archie>
Unfortunately, these linkmap programs have nothing to do with
Lathrop and Ott's linkage package. Most gene-linkage programs are
not on archie-registered ftp sites.
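A saved archie session like the one above can be turned into a list of hosts and paths with a few lines of Python. This is only a minimal sketch: it assumes the output layout shown in the example (a Host line, then Location: lines, each followed by FILE lines ending in the file name), and parse_archie_listing is a hypothetical helper, not part of archie.

```python
def parse_archie_listing(text):
    """Return (host, full_path) pairs from archie 'find' output.

    Assumes the layout shown in the session above: a 'Host' line, then
    'Location:' lines, each followed by 'FILE' lines whose last field
    is the file name.
    """
    results = []
    host = None
    location = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Host "):
            host = line.split()[1]          # second field is the host name
            location = None
        elif line.startswith("Location:"):
            location = line.split(":", 1)[1].strip()
        elif line.startswith("FILE ") and host and location:
            name = line.split()[-1]         # last field is the file name
            results.append((host, location.rstrip("/") + "/" + name))
    return results


# Excerpt copied from the session above.
sample = """\
Host gatekeeper.dec.com (16.1.0.2)
Last updated 21:04 9 Apr 1994
Location: /contrib/src/pa/m3-2.07/src/driver/linker/src
FILE -rw-r--r-- 1307 bytes 00:00 4 Dec 1991 M3LinkMap.i3
FILE -rw-r--r-- 3078 bytes 00:00 4 Dec 1991 M3LinkMap.m3
"""

for host, path in parse_archie_listing(sample):
    print(host, path)
```

Each pair gives the host to ftp to and the full path of the file to retrieve.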
I have an ftp site with gene-linkage programs/utilities on it. How do I
get registered with the archie servers? [rootd;15may94]
Send email to archie-admin at bunyip.com with the domain-name of
the ftp site and the email address of the administrator. If you are the
administrator of the ftp site, identify yourself as such.
What gopher sites have useful genetic-linkage information?
[rootd;21may94]
gopher.gdb.org has background information on the human
genome project, and archives of the "Human Genome News"
newsletter.
ftp.bio.indiana.edu is also a gopher site which can access
genbank. It also has a link to the genethon gopher site.
gopher.genethon.fr is the genethon gopher site.
ftp.nih.gov is the National Institutes of Health gopher. It can
access genbank, as well as other stuff.
gopher.nih.gov is also a gopher site which can access genbank
gopher.chlc.org has all information released by the cooperative
human linkage center.
larry.pathology.washington.edu 70 (I think that's a port
number) has human and mouse standard idiograms. The
idiograms are useful for making illustrations for gene mapping,
i.e. physical, and for constructing abnormal chromosome
illustrations, like translocations, deletions, etc. The PostScript
versions produce high quality output - can be sent to lino for
publication figures. The PostScript idiograms can be
manipulated band-by-band with illustration software such as
Adobe Illustrator, Aldus FreeHand, Canvas, Altsys Virtuoso,
etc.
megasun.bch.umontreal.ca has information on conferences,
and other stuff in:
--> 5. Computational Molecular Biology- programs, documents, help/
--> 14. Upcoming-Conferences/
What books are helpful when learning about genetic linkage analysis?
[rootd;21may94]
Jurg Ott's Analysis of Human Genetic Linkage is THE work in this
area. It is available from Johns Hopkins University Press ($47.50).
J.D. Terwilliger & J. Ott, "Handbook of Human Genetic Linkage,"
Johns Hopkins University Press, 1994, $60. It grew out of the handouts
for the linkage courses and provides detailed instructions on how to use
the LINKAGE programs (and some others) on a PC.
Guide to Human Genome Computing, edited by Martin J. Bishop, and
published by Academic Press (1994). It is very internet-oriented. The
first chapter talks about ftp sites, etc. and Chapter 3 is dedicated to
linkage analysis ($40).
E.A. Thompson: "Pedigree Analysis in Human Genetics", Johns
Hopkins University Press, Baltimore and London, 1986 ($35).
K.E. Davies (editor): "Human Genetic Diseases - A Practical
Approach". IRL Press, Oxford England and Washington, D.C., 1986
($25, softbound; $40, hardbound).
Muin J Khoury, Terri H Beaty, Bernice H Cohen. Fundamentals of
Genetic epidemiology. Oxford University Press 1993, Monographs in
epidemiology and biostatistics, Volume 19. "A good introductory book
with 339 pages (att:several mistakes)"
Please send me other suggestions.
What genetic-linkage databases are available on the internet?
[rootd;21may94][timt;09June94][rootd;19nov94]
medline is a database for searching for articles in journals. If your site
is a member of NorthWestNet, you can get to medline using telnet. Just
telnet to uwin.u.washington.edu and go into the library databases. It can
even email you the output if you wish! Many libraries and many
internet service providers have medline services online. Some
interfaces are better than others (we don't even bother using the one at
OHSU--it's too painful...) Your local library can probably supply you
with information.
[cgochiku;2Aug94] posted this:
For those of you out there with Macs who use MEDLINE and would
like a way to put those text files of downloaded references into a
database, check out medline-hc.sit in the Stanford archives. It is a
hypercard stack I wrote that allows fast importing of references,
including the abstracts. The file is at sumex-aim.stanford.edu
/info-mac/sci/medline-hc
Victor McKusick wrote a book: Mendelian Inheritance in Man. It is
continuously updated online at Johns-Hopkins University (making it
online-MIM or OMIM). Combined with the Genome-Data-Base, it
is available via ftp at ftp.gdb.org. You need to get an account. Send
email to help at gdb.org for information. After you get an account, the
telnet address is gdb.org. The GDB www address is gdbwww.gdb.org,
which has a useful but restricted version of GDB available.
Here's an old workshop announcement that might be useful:
IMPLEMENTATION OF THE INTEGRATED GENOMIC DATABASE
Organised by The Biocomputing Centre at DKFZ
Heidelberg 13-14th October 1994
The Integrated Genomic Database (IGD) is an international project to
develop an information management system for human genome researchers
which interconnects existing molecular biology databases and analysis
tools.
IGD is designed as a network system based on a client/server
architecture. With regard to the origin and scope of data, the system
can be subdivided into three levels: 1) resource databases which
contribute data 2) target database servers which manage the integrated
data 3) front-end clients which manage data locally to the user.
Users need to install the IGD front-end on a local workstation for
interacting with the IGD system. The most important parts of the
front-end are the local database manager and the interfaces to
communication and analysis. Users can query the IGD servers and
download the resulting data into their local database, where it can be
manipulated and analysed. Private data and analysis results may also be
deposited into the local database.
Registration of the workshop: 12th October, 18.00-20.00
For further information and details of accommodation please contact:
Mrs. Anke Retzmann
Dept. of Molecular Biophysics
Im Neuenheimer Feld 280
69120 Heidelberg
Germany
Tel.: +49-6221-422372
Fax.: +49-6221-422333
E-mail: a.retzmann at dkfz-heidelberg.de
What is WWW? [rootd;16may94]
WWW stands for world-wide-web. People set up www servers
(similar to anonymous ftp servers) that you can browse through. The
webspinners (people who set up web sites) include "links" to other
related sites. All you have to do is click a mouse-button on the link,
and you will immediately go to the other site. The CEPH www site,
for example, has a link to the genethon www site. This makes it very
easy for you to get related information. My favorite www site has the
before-repair and after-repair Hubble telescope pictures side-by-side.
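To make the idea of "links" concrete, here is a minimal Python sketch that pulls the link targets out of a small piece of html using the standard html.parser module. The page text is a made-up example (only the URL is taken from this FAQ), and LinkCollector is a hypothetical name.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href target of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Made-up page fragment containing a single link.
page = ('<p>See the <a href="http://www.ee.pdx.edu/~rootd/gene-linkage.html">'
        'gene-linkage FAQ</a>.</p>')

collector = LinkCollector()
collector.feed(page)
print(collector.links)
```

A browser does the same kind of scan and makes each collected target clickable.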
What is Mosaic? [rootd;16may94]
Written by NCSA (the National Center for Supercomputing
Applications) this program lets you look through www sites. It can
spawn viewers to look at graphical data, output sound data on your
computer's speaker (if your computer has a speaker), save your
"favorite" www sites between sessions, and access automated
www-search-engines (which search the www for you--similar to
archie).
What is Lynx? [rootd;19nov94]
Lynx is another world-wide-web browser (like Mosaic). Lynx,