FastAlert - The FastA Best Scores Alerting System
(C) 1995 BioComputing Basel
This is to announce the availability of the FastAlert v0.8 query software
that allows non-commercial users to get access to the FastAlert server
running on a DEC/AXP 4000 at BioComputing Basel, EMBnet Switzerland (other
sites offering this service may follow). The service is freely accessible
to EMBnet members and Swiss Universities. Requests from other (non-
commercial) users will only be handled if they remain within reasonable
limits. Note that the service is still in an experimental phase.
WHAT IS IT, WHY SHOULD YOU USE IT
In view of the daily growth of protein and nucleotide sequence databases,
scientists have to re-evaluate their sequence search findings periodically.
FastAlert is an automatic search system that performs periodic scans of
continuously updated protein or DNA databanks with user-provided query
sequences. The name of the system is derived from FastA [1] as it
incorporates part of W.R. Pearson's FastA package. FastAlert is built on
top of the HASSLE v5 [2,3] network protocol which handles the resource
discovery in a fully transparent way. Thus, the user does not need to know
the location of the FastAlert service provider. The user's query sequence
is registered automatically at the nearest server which is able to handle
the request. Access control is accomplished by a host-bound authorization
mechanism. Upon registration the sequence is periodically searched against
the appropriate set of databases. The results, so-called FastAlert reports,
are delivered periodically via electronic mail. The reports contain the
FastA [4] best-scores list and the similarity statistics for each entry
listed. Initially, a full report is generated. Subsequent reports contain
only those entries of the FastA best-scores list which did not appear in
the previous search. The probability estimates for the similarity scores as
produced by FastA are calculated using the prdf program (W.R. Pearson and
D. Lipman) which is an improved version of RDF [5]. Basically, prdf
compares two sequences by calculating initial and optimal similarity scores
and then repeatedly shuffles the second sequence and calculates the
similarity scores again. Extreme value distributions [6] are then fit to
the distributions of the scores. This makes it possible to estimate the
probability that each of the unshuffled sequence scores would be obtained
by chance.
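The shuffling procedure described above can be illustrated with a minimal
sketch. This is not the prdf code: the scoring function (a plain
Smith-Waterman score with fixed match/mismatch/gap values), the
method-of-moments Gumbel fit, and all names (local_score, shuffle_pvalue,
n_shuffles) are assumptions chosen for illustration only. prdf itself uses
the FastA scoring scheme and its own fitting procedure.

```python
import math
import random
import statistics

EULER_GAMMA = 0.5772156649015329


def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman, linear gap cost).

    Stand-in for the FastA similarity scores; scoring values are assumed.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            score = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(score)
            best = max(best, score)
        prev = cur
    return best


def shuffle_pvalue(query, subject, n_shuffles=200, seed=0):
    """Estimate how likely the observed score is to arise by chance.

    The second sequence is shuffled repeatedly, a Gumbel (type I extreme
    value) distribution is fitted to the shuffled scores by the method of
    moments, and the upper-tail probability of the observed score is
    returned together with the score itself.
    """
    rng = random.Random(seed)
    observed = local_score(query, subject)
    letters = list(subject)
    null_scores = []
    for _ in range(n_shuffles):
        rng.shuffle(letters)
        null_scores.append(local_score(query, "".join(letters)))

    scale = statistics.stdev(null_scores) * math.sqrt(6) / math.pi
    if scale == 0:
        # Degenerate case: every shuffle gave the same score.
        return observed, (1.0 if observed <= null_scores[0] else 0.0)
    location = statistics.mean(null_scores) - EULER_GAMMA * scale

    # P(S >= observed) = 1 - exp(-exp(-(observed - location) / scale))
    z = min(-(observed - location) / scale, 700.0)  # avoid overflow in exp
    p = -math.expm1(-math.exp(z))
    return observed, p


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"
    score, p = shuffle_pvalue(seq, seq)
    print("score:", score, "estimated chance probability:", p)
```

Shuffling keeps the length and composition of the second sequence but
destroys its order, so the shuffled scores approximate what unrelated
sequences of the same composition would yield.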
SUPPORTED PLATFORMS
FastAlert requires direct access to the Internet (i.e. the computer running
FastAlert must have an IP address). Precompiled binaries for the following
platforms are available: Macintosh, MS-Windows 3.1x, OS/2 Warp, Irix 4x,
Irix 5x, OSF/1 V2, AXP/OpenVMS V1.5. System requirements:
-Macintosh: System 7.x/MacTCP
-MS-Windows 3.1x: TCP/IP package (Winsock-compliant)
-OS/2 Warp: IBM TCP/IP 2.x (so32dll, tcp32dll)
-Irix, OSF/1: X11/Motif
-AXP/OpenVMS: DecWindows, UCX
AVAILABILITY
The FastAlert v0.8 query software is available via anonymous FTP at:
nic.switch.ch/mirror/embnet-ch/bioftp-sw/fastalert
ACKNOWLEDGMENTS
Thanks to Bill Pearson for comments on an earlier version of this program.
IBM Switzerland provided help with the program development on OS/2.
Financial support was provided by the University of Basel, the Swiss
National Science Foundation and the Bundesamt fuer Bildung und
Wissenschaft.
REFERENCES
[1] Pearson, W.R. and Lipman, D.J. Rapid and Sensitive Protein Similarity
Searches. Science 227(1985), 1435-1441.
[2] Doelz, R. Hierarchical Access System for Sequence Libraries in Europe
(HASSLE): a tool to access sequence databases remotely. CABIOS 10(1994),
31-34.
[3] Doelz, R., Eggenberger, F. and Wadley, C. Biocomputing on a Server
Network. embnet.news 1(1994), 6-8 (electronic version available via World-
Wide Web at: http://www.ch.embnet.org/embnet.news/info.html).
[4] Genetics Computer Group, Inc. Program Manual for the Wisconsin Package,
Version 8 (1994).
[5] Pearson, W.R. and Lipman, D.J. Improved tools for biological sequence
comparison. Proc. Natl. Acad. Sci. 85(1988), 2444-2448.
[6] Altschul, S.F. et al. Issues in searching molecular sequence databases.
Nature Genetics 6(1994), 119-129.
--
F. Eggenberger, R. Doelz, N. Redaschi
BioComputing, Biozentrum
University of Basel
Switzerland
embnet at comp.bioz.unibas.ch
http://www.ch.embnet.org/