DAMBE (Data Analysis in Molecular Biology and Evolution) is an integrated
software package running on Windows 95 for comparative analysis of
molecular data (including nucleotide and amino acid sequence data as well
as allele frequency data). The software is available at:
http://web.hku.hk/~xxia/software/software.htm
The main features are outlined below:
--File input with commonly used DNA and protein sequence formats,
including MEGA (interleaved and sequential), Pearson/FASTA, ClustalW, IG
(IntelliGenetics), GB (GenBank), GCG single-sequence (.GCG) and
multi-sequence format (.MSF), Yang's PAML format (interleaved and
sequential, but not the one with the G option) and RST format, Kumar's
PHYLTEST format, as well as those formats understood by Don Gilbert's
READSEQ (which is included for your convenience). One particular strength
of the program lies in its versatility with the GenBank (GB) file format:
the program can automatically splice out and join CDS, exons, introns, or
mRNA segments according to the FEATURES table in the GB file.
--File conversion between commonly used sequence formats.
--Saving subsets of sequences in any of the commonly used formats.
--Many convenient ways of manipulating nucleotide and amino acid sequences.
--Many descriptive statistics of molecular sequences.
--Generating distance matrices including Jukes and Cantor's (1969)
distance, Kimura's (1980) two-parameter distance, Tajima and Nei's (1984)
distance, Lake's (1994) paralinear distance, and an entropy-based distance
which, in preliminary studies, seems to perform better in phylogenetic
reconstruction than the other distances.
--Phylogenetic analyses using distance, maximum parsimony and maximum
likelihood methods; resampling statistics (bootstrap and jackknife).
--Graphical display of phylogenetic trees (including consensus trees) and
topology manipulation.
--For pair-wise differences, one particular strength of the program lies in
its ability to count the differences between neighboring nodes along a
phylogenetic tree if the tree structure is provided. For example, the
BASEML program in the PAML package can reconstruct ancestral sequences when
a topology is given. The output file from BASEML, called rst by default,
contains the tree topology as well as the reconstructed sequences. DAMBE
can read the RST file directly and count pair-wise differences in
nucleotide, amino acid and codon sequences between neighboring nodes along
the phylogenetic tree.
--Fitting the Poisson and negative binomial distributions to nucleotide, amino
acid, and codon substitutions along the DNA sequences.
--Computing expected pattern of codon substitution given a genetic code
(e.g., mammalian mitochondrial), with or without adjustment for codon
frequencies. All 12 different genetic codes have been implemented.
--Quantifying the extent of substitution saturation in nucleotide sequences,
to help decide whether to include or exclude certain nucleotide sites in
phylogenetic reconstruction.
--Plotting amino acid properties (chemical composition along the side
chain, volume, polarity, polar requirement, hydropathy, aromaticity, and
isoelectric point) along sequences.
--Computing an entropy-based measure of variability over sites and plotting
the variability along the sequences.
--Special computational tools for analyzing the effect of DNA methylation
on molecular evolution of nucleotide sequences.
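For readers unfamiliar with the distance measures listed above, here is a minimal sketch in Python of the two simplest ones, Jukes and Cantor's (1969) distance and Kimura's (1980) two-parameter distance, computed from a pair of aligned nucleotide sequences. This is not DAMBE's own code. The function names (count_differences, jc69_distance, k2p_distance) are hypothetical and chosen for illustration only, and the sketch assumes that sites with gaps or ambiguous codes are simply skipped.

```python
import math

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")


def count_differences(seq1, seq2):
    """Return (sites compared, transitions, transversions) for two aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous codes
        n += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    return n, transitions, transversions


def jc69_distance(seq1, seq2):
    """Jukes-Cantor distance: d = -(3/4) ln(1 - 4p/3), p = proportion of differing sites."""
    n, ts, tv = count_differences(seq1, seq2)
    if n == 0:
        raise ValueError("no comparable sites")
    p = (ts + tv) / n
    arg = 1 - 4 * p / 3
    if arg <= 0:
        raise ValueError("sequences too divergent for the JC69 correction")
    return -0.75 * math.log(arg)


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance:
    d = -(1/2) ln(1 - 2P - Q) - (1/4) ln(1 - 2Q),
    with P and Q the proportions of transitions and transversions."""
    n, ts, tv = count_differences(seq1, seq2)
    if n == 0:
        raise ValueError("no comparable sites")
    P = ts / n
    Q = tv / n
    arg1 = 1 - 2 * P - Q
    arg2 = 1 - 2 * Q
    if arg1 <= 0 or arg2 <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(arg1) - 0.25 * math.log(arg2)


if __name__ == "__main__":
    s1 = "AAAAAAAAAA"
    s2 = "GAAAAAAAAC"  # one transition (A->G) and one transversion (A->C)
    print(count_differences(s1, s2))
    print(jc69_distance(s1, s2))
    print(k2p_distance(s1, s2))
```

Both functions raise an error rather than return a value when the observed divergence is so high that the logarithm is undefined, which is one simple way to flag substitution saturation in a pairwise comparison.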
The strength of DAMBE is not in what it can do, but in the way it does it,
which I believe will save many researchers many hours.
Best.
Xuhua
===================================================================
Xuhua Xia | Tel: (852) 2857 8239 (lab)
Assistant Professor | Tel: (852) 2975 5629 (office)
Department of Ecology & Biodiversity| Fax: (852) 2517 6082
The University of Hong Kong | Email: xxia at hkusua.hku.hk
Pokfulam Road | WWW: http://web.hku.hk/~xxia
Hong Kong |
===================================================================