Newsgroups: bionet.software.acedb
Subject: ACEDB Genome Database Software FAQ
Followup-To: bionet.software.acedb
Reply-To: matthews at greengenes.cit.cornell.edu
Distribution: world
Organization: USDA-ARS, Dept. Plant Breeding, Cornell University
Summary: Frequently Asked Questions about the genome database software ACEDB.
URL: http://ars-genome.cornell.edu/acedocs/acedbfaq.html
Archive-name: acedb-faq
Last-modified: 19 May 2001
Version: 1.44
_________________________________________________________________
ACEDB FAQ
_________________________________________________________________
Curated by: Dave Matthews
_________________________________________________________________
Frequently Asked Questions about ACEDB
* Q0 : What is ACEDB?
* Q1 : ! What is the current version of ACEDB?
* Q2 : ! Where can I get ACEDB?
* Q3 : What hardware/software do I need to run ACEDB?
* Q4 : Can ACEDB be networked?
* Q5 : What documentation exists for ACEDB?
* Q6 : Can I subscribe to the ACEDB newsgroup by mail?
* Q7 : Is there a repository of software tools for ACEDB curators?
* Q8 : When and where is the next ACEDB Workshop?
* Q9 : How does ACEDB compare to commercial relational DBMS's?
* Q10 : How should ACEDB be cited?
* Q11 : What ACEDB databases exist?
* Q12 : Who prepared this document & where is the current version?
Questions marked with '!' have substantially changed answers since the
last update of the FAQ.
_________________________________________________________________
Q0: What is ACEDB?
A0:
ACEDB is an acronym for "A Caenorhabditis elegans Database". It can
refer to a database and data concerning the nematode C. elegans, or to
the database software alone. This document is concerned primarily with
the latter meaning. ACEDB is being adapted by many groups to organize
molecular biology data about the genomes of diverse species.
ACEDB allows for automatic cross-referencing of items during loading
and allows for hypertextual navigation of the links using a graphical
user interface and mouse. Certain special purpose graphical displays
have been integrated into the software. These reflect the needs of
molecular biologists in constructing genetic and physical maps of
genomes.
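The cross-referencing idea can be illustrated with a short sketch. This
is not ACEDB code (ACEDB itself is written in C). The load_link helper,
the dictionary layout, and the object names gene-1 and clone-A are all
hypothetical, chosen only to show that recording a link in one object
can also record the reverse link in the object it points to.

```python
from collections import defaultdict

# Hypothetical in-memory store: (class, name) -> {tag: set of (class, name)}.
# Illustration only; this is not how ACEDB stores its data.
database = defaultdict(lambda: defaultdict(set))


def load_link(source, tag, target, reverse_tag):
    """Record source --tag--> target and, automatically, the reverse link."""
    database[source][tag].add(target)
    database[target][reverse_tag].add(source)


def neighbours(obj):
    """Return every object reachable from obj in one step, sorted."""
    return sorted({t for targets in database[obj].values() for t in targets})


# Made-up example objects. Loading one statement about the gene
# also makes the gene reachable from the clone.
gene = ("Gene", "gene-1")
clone = ("Clone", "clone-A")
load_link(gene, "Clone", clone, reverse_tag="Gene")
```

After the single load_link call, neighbours(clone) lists the gene even
though nothing was entered for the clone directly, which is the kind of
link a user would follow when navigating.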
ACEDB was written and developed by Richard Durbin (MRC LMB Cambridge,
England) and Jean Thierry-Mieg (CNRS, Montpellier, France), beginning
in 1989. It is written in the C programming language and uses the X11
windowing system to provide a platform independent graphical user
interface. The source code is publicly available. Durbin &
Thierry-Mieg continue to develop the system, with contributions from
other groups.
A description by Durbin & Thierry-Mieg:
ACEDB does not use an underlying relational database schema, but a
system we wrote ourselves in which data are stored in objects that
belong in classes. This is nevertheless a general database management
system using caches, session control, and a powerful query language.
Typical objects are clones, genes, alleles, papers, sequences, etc.
Each object is stored as a tree, following a hierarchical structure
for the class (called the "model"). Maps are derived from data stored
in tree objects, but precomputed and stored as tables for efficiency.
The system of models allows flexibility and efficiency of storage
--missing data are not stored. A major advantage is that the models
can be extended and refined without invalidating an existing database.
Comments can be added to any node of an object.
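A rough sketch of the tree-and-model idea described above, under stated
assumptions: the nested-dictionary representation, the tag names, and
the validate function are invented for illustration and do not reflect
ACEDB's internal format or its models.wrm syntax. The sketch shows two
points from the description: absent branches are simply not stored, and
adding a tag to the model leaves existing objects valid.

```python
# Hypothetical "model" for a Gene class: the tags an object may carry.
GENE_MODEL = {
    "Map": {},
    "Reference": {},
    "Molecular_information": {"Clone": {}, "Sequence": {}},
}

# A made-up object stored as a sparse tree: it has no Reference and no
# Sequence, so those branches are absent rather than stored as empty.
gene = {
    "Map": ["IV"],
    "Molecular_information": {"Clone": ["clone-A"]},
}


def validate(obj, model):
    """Check that every branch stored in the object exists in the model."""
    for tag, value in obj.items():
        if tag not in model:
            return False
        # Nested dictionaries are sub-trees; lists are leaf values.
        if isinstance(value, dict) and not validate(value, model[tag]):
            return False
    return True


# Extending the model with a new tag does not invalidate the old object.
EXTENDED_MODEL = dict(GENE_MODEL, Allele={})
```

Here validate(gene, GENE_MODEL) and validate(gene, EXTENDED_MODEL) both
succeed, while an object carrying a tag the model does not know about is
rejected.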
_________________________________________________________________
Q1: What is the current version of the ACEDB software?
A1:
New! The current version for Unix and Windows is 4_9a, 26 Apr 01.
Updates are released ca. monthly at
http://www.acedb.org/Software/Downloads/supported.shtml.
These updates don't usually have different version numbers so please
note the dates.
The current version for Macintosh is 4.1b1, August 1995.
WWW interfaces: (See "Can ACEDB be networked?".)
* New! AcePerl -- version 1.72 (release date not recorded; version
1.68 was dated 15 Oct 2000).
* AceBrowser -- version 2.10, September 1999
* webace2K (Cornell) -- May 2000
* webace2 (Sanger) -- version 2.0a5, March 1998
* CITA: CORBA Interface To ACEDB (UK Cropnet) -- May 2000
* Jade: Java for ACEDB
+ version 1.0j, Apr 1998
+ New! jadex, 2001
_________________________________________________________________
Q2: Where can I get ACEDB?
A2:
Source code and Unix and Windows binaries are available at:
* New! http://www.acedb.org/Software/Downloads/supported.shtml
* ncbi.nlm.nih.gov in repository/acedb (not up to date at present)
* http://alpha.crbm.cnrs-mop.fr/acedb/distrib/ (offline at present)
MacAce, from Frank Eeckman, Cyrus Harmon and Richard Durbin:
(Note: The authors are not currently able to support MacAce. Latest
version was 4.1b1.)
* ftp.sanger.ac.uk in pub/acedb/macace
* ncbi.nlm.nih.gov in repository/acedb/macace
_________________________________________________________________
Q3: What hardware/software do I need to run ACEDB?
A3:
The software is available in binary (pre-compiled) format for a
variety of machines.
* Unix:
+ Sun/SunOS 4.x
+ Sun/Solaris
+ DEC DECstation3100, 5100 etc.
+ DEC Alpha/OSF-1
+ Silicon Graphics Iris series 4, 5, 6
+ IBM RS-6000
+ PC 386/486/Pentium with Linux
+ NEC EWS4800
+ NeXT: contact Patrick Phillips at University of Texas,
NeXTmail: patrick at wbar.uta.edu email: phil at decster.uta.edu
+ There exist, or have existed, ports onto Alliant, Hewlett-
Packard, Convex. You may have to contact the developer
responsible for the port to make these real.
* Windows 95/98/NT/2000
* Macintosh (not currently supported)
The software is also available as source code, so you may be able to
get it working on any machine.
Memory requirements (from Richard Durbin, aug 97)
The amount of memory you require for ACEDB depends very much on how
big the database is (i.e. the disk space used by the database/
subdirectory). Our rule of thumb is that one typically uses 5-10Mb
plus up to 10% of the disk space size of the database. So with a 200Mb
database perhaps 25Mb memory, and with a 500Mb database (e.g. the C.
elegans one) up to 50-60Mb. In fact for short sessions less memory is
used -- it is only when all classes are explored, or for example when
parsing big files that these amounts of memory get used.
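The rule of thumb above can be written out as a small calculation. This
is only a sketch of the arithmetic: the function name is made up, and it
reads "up to 10%" as a ceiling, so real usage, especially in short
sessions, may be lower than the figures it returns.

```python
def estimate_memory_mb(database_mb):
    """Return a (low, high) memory estimate in Mb for a database of the
    given disk size: a 5-10 Mb base plus up to 10% of the disk size."""
    share = database_mb / 10  # the "up to 10%" term, taken at its maximum
    return (5 + share, 10 + share)


print(estimate_memory_mb(200))  # (25.0, 30.0)
print(estimate_memory_mb(500))  # (55.0, 60.0)
```

These figures are in line with the examples in the text: about 25Mb for
a 200Mb database and up to 50-60Mb for a 500Mb one.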
_________________________________________________________________
Q4: Can ACEDB be networked?
A4:
ACEDB Client / Server Computing (from Doug Bigwood, aug97)
There are several client/server models for ACEDB computing and several
more are in development. The ACEDB client/server age began with the
inclusion of aceclient and aceserver in version 4.0. These are C-based
and use the RPC protocol for communication. These
executables can be made from the standard ACEDB distributions.
Starting in version 4.5 an xaceclient is also included with ACEDB.
Xaceclient provides remote read/write access to an aceserver while
providing the user with the same X displays that are found in xace. To
use it, you create an empty database with the appropriate models and
start xaceclient. It will automatically retrieve data from the server
declared in wspec/server.wrm (the Montpellier server in the
distribution server.wrm). The data will be saved locally and can then
be viewed with a normal xace.
A perl extension which provides aceclient functionality to Perl 5.x
was developed at ACE95. The files necessary for this perl extension
are now (ACEDB 4.5 and later) included in the wrpc directory of the
ACEDB directory hierarchy. Documentation about how to extend perl is
found at
http://ars-genome.cornell.edu/acedocs/ace97/perlace/perlacecl.html.
WWWAce and its successor webace were developed to provide a World Wide
Web interface for ACEDB. Webace instructions can be found at
http://ars-genome.cornell.edu/acedocs/webace.html, and
http://ars-genome.cornell.edu/acedocs/ace97/webace.html and the
program itself at
ftp://ars-genome.cornell.edu/pub/tools/webace.tar.gz.
A Java-based client called Jade allows communication via sockets to an
aceserver. Jade installation instructions and information on
downloading can be found at http://stein.cshl.org/jade/.
There are now development efforts underway to provide additional
client/server functionality to ACEDB including a CORBA server and
socket-based communications. These will likely be included in future
versions of ACEDB. A new C library interface to ACEDB internals will
greatly ease the development of new clients and servers that will
support additional protocols.
Subsequent developments (from Dave Matthews, jul00)
A new version of webace, sometimes called webace2, was developed at
the Sanger Centre. It uses the new gifaceserver instead of aceserver
to improve interactive response of the graphical displays. It also
uses Javascript, Java, and a new Aceclient.pm module which can be
installed into Perl without recompiling, and it supports the ACEDB
?URL class.
The home page is at http://webace.sanger.ac.uk/. Its authors currently
consider it "deprecated", preferring AceBrowser.
AcePerl, from Lincoln Stein, is an object-oriented Perl interface to
ACEDB. It can connect to remote ACEDB databases, perform queries,
fetch ACE objects, and update databases. The programmer's API is
compatible with the Jade Java API. Home page at
http://stein.cshl.org/AcePerl/.
AceBrowser, from Lincoln Stein, is a ready-to-use WWW gateway to ACEDB
databases built on AcePerl. It has most of the functionality of
webace. http://stein.cshl.org/AcePerl/AceBrowser/.
webace2K is an enhancement of webace2, from Maria Nemchuk.
http://ars-genome.cornell.edu/webace/webace_install.html
CITA is a CORBA Interface To ACEDB, from UK CropNet.
http://jic-bioinfo.bbsrc.ac.uk/BrassicaDB/CITA/
_________________________________________________________________
Q5: What documentation exists for ACEDB?
A5:
At the Sanger Centre, www.acedb.org
* Current documentation and news, including:
* Proceedings of the ACEDB2000 Conference
* A "quick guide" tutorial for users, http://www.acedb.org/Tutorial/
* Archive of the monthly ACEDB User Group Newsletter from Ed
Griffiths, http://www.acedb.org/winfo/Newsletters/
* WebDDTS for ACEDB bug reporting and tracking
* WWW copies of the ACEDB online help (see below),
http://www.acedb.org/Software/whelp/TOC.html
In the ACEDB Documentation Library, http://ars-genome.cornell.edu/acedocs/
* Archive of many still interesting documents, including:
* The original documentation from the developers Durbin &
Thierry-Mieg, ca. 1992.
+ acedb -- A C. elegans Database: I. Users' Guide
+ acedb -- A C. elegans Database: II. Installation Guide
+ acedb -- A C. elegans Database: III. Configuration Guide
+ acedb -- A C. elegans Database: Syntactic Definitions for the
ACEDB Data Base Manager
* Tutorials and technical guides, for users, curators and
programmers
* Documentation written at the ACEDB Workshops
* Documents in Postscript, wordprocessor and other non-html formats,
at ftp://ars-genome.cornell.edu/pub/acedocs/.
* Some curator tools
* A selection of models.wrm files from various databases
* SampleDB, a sample database to demonstrate some ACEDB features,
1995
* This FAQ in html format
Other
* The ACEDB online help, available from the Help buttons in xace. A
context-sensitive hypertext reader for the contents of the whelp
directory of the ACEDB software distribution.
* Contents of the wdoc, wtools, and wscripts directories.
* Searchable archives of the bionet.software.acedb newsgroup
+ http://www.bio.net/hypermail/ACEDB/, from BIOSCI
+ http://genome-www.stanford.edu/cgi-bin/biosci_acedb, from the
Saccharomyces Genome Database site
* ACEDB User's Guide in Japanese, from Tohru Sano, NEC,
sano at exp.cl.nec.co.jp,
http://www.cbi.or.jp/~sano/. (Postscript at
http://www.labs.nec.co.jp/ . Follow the prompts to register and
"download the software".)
* Paper publications
+ Cherry, J.M., Cartinhour, S.W., and Goodman, H.M. (1992).
AAtDB, an Arabidopsis thaliana database. Plant Molecular
Biology Reporter 10: 308-309, 409-410.
+ Cherry, J.M. and Cartinhour, S.W. (1994). ACEDB, A tool for
biological information. Pp. 347-356 in: Automated DNA
Sequencing and Analysis, M. Adams, C. Fields, and C. Venter
(Eds.). Academic Press. Online version:
http://ars-genome.cornell.edu/acedocs/overview.html.
+ Dunham, I., Durbin, R., Mieg, J-T & Bentley, D.R. (1994).
Physical mapping projects and ACEDB. Pp. 111-158 in: Guide to
Human Genome Computing. Bishop, M.J (Ed.). Academic Press.
_________________________________________________________________
Q6: Can I subscribe to the ACEDB newsgroup by mail?
A6:
Yes! Just send the message "subscribe acedb" to
biosci-server at net.bio.net.
You can also post to the newsgroup by mail: write to
acedb at net.bio.net.
Or you can access it with a standard newsreader like rn or tin at
bionet.software.acedb, or with a WWW browser at
news:bionet.software.acedb.
The articles are archived by BIOSCI at
http://www.bio.net/archives.html and by Mike Cherry at
http://genome-www.stanford.edu/cgi-bin/biosci_acedb. Both archives are
indexed for searching. This is the place to find the Questions that
really are Frequently Asked!
_________________________________________________________________
Q7 : Is there a repository of software tools for ACEDB curators?
A7:
Not really, but there are several partial ones. The main tools
available are for converting data from other formats to .ace format.
The USDA-ARS Center for Bioinformatics and Comparative Genomics has
some useful tools at
http://ars-genome.cornell.edu/acedocs/conversion.html. Some additional
ones were contributed at the ACE97 Workshop and can be found in the
Proceedings, http://ars-genome.cornell.edu/acedocs/ace97/tools/.
Mike Cherry maintains an archive of tools at
ftp://genome-ftp.stanford.edu/pub/acedb_dev/utilities/
For a general tool for converting data to ACEDB format input files,
Joachim Baumann (joachim.baumann at informatik.uni-stuttgart.de) has
written the Perl program TextConvert, available at
ftp.informatik.uni-stuttgart.de/pub/DART/.
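As a minimal illustration of what such a conversion involves, the sketch
below turns tabular rows into .ace-style text. It is not one of the
tools listed above. It assumes the common .ace layout of an object
header line (Class : "name") followed by one tag-value line per datum,
with a blank line between objects. The function name, the column names,
and the sample row are hypothetical, and values containing quotation
marks are not handled.

```python
def rows_to_ace(class_name, rows, name_field):
    """Render dict rows as .ace-style paragraphs, one paragraph per object."""
    paragraphs = []
    for row in rows:
        # Header line naming the object, e.g.  Gene : "gene-1"
        lines = [f'{class_name} : "{row[name_field]}"']
        for tag, value in row.items():
            if tag == name_field or value in ("", None):
                continue  # empty cells are skipped, not written as blanks
            lines.append(f'{tag} "{value}"')
        paragraphs.append("\n".join(lines))
    # Objects are separated by a blank line.
    return "\n\n".join(paragraphs) + "\n"


# Made-up input, as it might come from a spreadsheet export.
sample_rows = [
    {"name": "gene-1", "Map": "IV", "Reference": ""},
    {"name": "gene-2", "Map": "X", "Reference": "paper-7"},
]
ace_text = rows_to_ace("Gene", sample_rows, name_field="name")
```

The tag names used as column headings would have to match the tags in
the target database's models for the output to load.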
_________________________________________________________________
Q8: When and where is the next ACEDB Workshop?
A8:
The ACEDB2000 Workshop was held June 10-16 at Simon Fraser University,
B.C., Canada. The Proceedings are at
http://www.acedb.org/winfo/Conferences/acedb2000/.
The ACE97 Conference and Workshop was held July 27 - August 9 at
Cornell University, Ithaca, New York, USA. See the ACE97 Proceedings
Page, http://ars-genome.cornell.edu/acedocs/ace97/proceedings.html for
the results.
The Proceedings from the May 1995 ACEDB Conference are available at
http://ars-genome.cornell.edu/acedocs/ace95/. A final summary report
is available at
http://ars-genome.cornell.edu/acedocs/ace95/ace95.final.html. Also
available online are collections of snapshots taken during the
conference by Frank Eeckman and by Dave Matthews.
For pictures of the ACEDB '94 Workshop in St. Matthieu de Treviers,
see the online collections:
* by Mike Cherry at
http://genome-www.stanford.edu/~cherry/pics/acedb-94.pics.html ;
* by John Morris at
http://weeds.mgh.harvard.edu/ace94/ace94.image.html ;
* and by Brad Sherman at ftp://s27w007.pswfs.gov/ACEDB/ace94pix.html
_________________________________________________________________
Q9: How does ACEDB compare to commercial relational DBMS's?
A9:
From Jean Thierry-Mieg, 4/97:
Obviously, i have a biased opinion, but i would say that acedb is to
be recommended if the following criteria are met:
1) A very complex schema, that cannot be developed at once, but will
need continuous refinement in parallel with the accumulation of the
data
2) The type of questions that will be asked are rather complex, with
rather fuzzy answers, that one tries to refine progressively. The
acedb browsing capacities are useful in this case and have no
equivalent in a relational dbms
______________
I would rather recommend sybase in the following case
1) Simple schema, that can be designed from the start and does not
contain too many n.n relations and does not need recursivity
2) The type of questions that will be asked is: succession of
de-correlated simple questions with simple answers
____________________
Within this context, i would then list the following goodies of acedb:
1) The ace file format, which is a powerful system to prepare and
exchange data between data curators.
2) The existence of an easy graphic browsing interface
3) The availability of a biology-layer, if the application is about
genetics
4) Portability (any unix machine), mac (with some limitations),
windows (in development) and price (ace is a freeware). This implies
that you can actually redistribute the complete system, say on a CD,
something impossible with sybase.
5) Ease of use, i seriously believe that ace is much easier to
configure and use than sybase.
_____________________
Finally one should consider the following question: concurrency.
Sybase has a well designed transaction system, which will allow roll
backs and refined lockings. This is essential for an application like
a booking agency, with many users in simultaneous write access.
Ace is much simpler minded. The graphic acedb creates a global lock
allowing a single user with write access at a time, and the
modifications are not echoed to the other "read access" users in real
time.
The non graphic client server system allows parallel downloading of
data by many users, it is intended for example for collection of
robots sending their independent data in parallel. This is now well
tested.
A graphic client system is being developed and now runs in our hands,
but is not yet released.
--
Therefore, if you do need real time simultaneous write access with
partial locks, and roll backs, use sybase/oracle
________________
Last issue is speed and quantities of data. In principle,
sybase/oracle is unlimited, whereas acedb needs to keep around 5-10%
of the data in ram. But this apparent difference is misleading.
On a 32 Meg machine, you can run ace with around 300.000 objects with
a complex schema at high speed. With say 1M objects, you will need
more memory or the performance would totally degrade because of
swapping. However, this is really a lot of data.
On a similar machine, your sybase oracle will work with that amount or
more data only if you do not perform too many joins. This implies that
you are asking simple questions from a simple schema which was indeed
our first criterion to choose sybase. If you start asking complex
questions and make joins, acedb is actually much more powerful.
During tests run on a big dec alpha server by Otto Ritter in December
1995 on several million biological objects with a complex schema,
acedb was about 10 times faster than sybase, both to load the data and
to answer queries.
I would therefore conclude that the quantity of data is not a
criterion pushing one way or the other, it is the complexity of the
schema that matters.
_________________________________________________________________
Q10: How should ACEDB be cited?
A10:
From the distribution:
We realize that we have not yet published any "real" paper on ACEDB.
We consider however that anonymous ftp servers are a form of
publication. We would appreciate if users of ACEDB could quote:
Richard Durbin and Jean Thierry Mieg (1991-). A C. elegans Database.
Documentation, code and data available from anonymous FTP servers at
lirmm.lirmm.fr, cele.mrc-lmb.cam.ac.uk and ncbi.nlm.nih.gov.
Papers involved in database development could quote more precisely:
I. Users' Guide. Included as part of the ACEDB distribution kit,
II. Installation Guide. Included as part of the ACEDB distribution kit,
III. Configuration Guide. Included as part of the ACEDB distribution
kit, available via anonymous ftp,
and the preprint: Jean Thierry-Mieg and Richard Durbin (1992).
Syntactic Definitions for the ACEDB Data Base Manager. Included as part
of the ACEDB distribution.
--Jean and Richard.
_________________________________________________________________
Q11: What ACEDB databases exist?
A11:
Too many to maintain an up-to-date listing. A list as of mid-1998 is
available at http://ars-genome.cornell.edu/acedocs/acedbfaq.dbs.html
A repository of many of these databases is maintained by CBCG, both
for anonymous ftp at ftp://ars-genome.cornell.edu/pub and for WWW
access via Webace at http://ars-genome.cornell.edu/.
_________________________________________________________________
Q12: Who prepared this document & where is the current version?
A12:
This document is posted monthly to the BIOSCI newsgroup
bionet.software.acedb.
The WWW version is at
http://ars-genome.cornell.edu/acedocs/acedbfaq.html.
This FAQ was created and maintained from 1993 - 1996 by Bradley K.
Sherman. Major contributions in getting it off the ground were made by
Mike Cherry, John McCarthy, and Doug Bigwood. Other contributors
include:
* Lisa Lorenzen
* David Matthews
* Edie Paul
* Donn Davy
* Eric De Mund
* Sam Cartinhour
It is currently maintained by Dave Matthews.
Please cite as:
Matthews, D.E., and B.K. Sherman, ACEDB Genome Database Software FAQ,
http://ars-genome.cornell.edu/acedocs/acedbfaq.html, 1993-2000,
approx. 30K bytes.
To add or modify information in this document, please send mail to:
matthews at greengenes.cit.cornell.edu
The GrainGenes Project is funded by the USDA ARS Plant Genome Research
Program.
_________________________________________________________________
---