I am very pleased to announce the availability of
PRATT version 2.1
A tool for finding flexible patterns in
unaligned protein sequences
Written by:
Inge Jonassen,
Dept. of Informatics,
University of Bergen, Norway
see also: http://www.ii.uib.no/~inge/Pratt.html
Source code (ANSI C) is available at this URL, and has been
compiled and run successfully on a variety of UNIX systems,
including Linux.
The author is very grateful to Des Higgins, John F. Collins, and Ingvar
Eidhammer for help and collaboration. John F. Collins in particular
proposed many of the features added in this version.
Pratt is a program that allows the user to efficiently search for
patterns conserved in a set of protein sequences. It allows the user
to define the class of patterns to be searched for, and then finds
conserved patterns in this class.
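As a rough illustration of what it means for a pattern to match a minimum number of sequences, the sketch below counts how many sequences contain a given flexible pattern. It is not Pratt's search algorithm, only a check of one already-known pattern. The PROSITE-style notation (`C-x(2,4)-[DE]`), the function names `pattern_to_regex` and `support`, and the example sequences are all assumptions made for this illustration.

```python
import re

def pattern_to_regex(pattern):
    """Translate a PROSITE-style pattern such as 'C-x(2,4)-[DE]' into a regex.

    Assumed notation: elements are separated by '-'; 'x', 'x(n)' or
    'x(min,max)' is a wildcard region; anything else is a component.
    """
    parts = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"x(?:\((\d+)(?:,(\d+))?\))?", element)
        if m:
            # Wildcard region of fixed or flexible length.
            low = m.group(1) or "1"
            high = m.group(2) or low
            parts.append(".{%s,%s}" % (low, high))
        else:
            # Component: a single residue or a set such as [DE].
            parts.append(element)
    return re.compile("".join(parts))

def support(pattern, sequences):
    """Count how many sequences contain at least one match of the pattern."""
    regex = pattern_to_regex(pattern)
    return sum(1 for seq in sequences if regex.search(seq))

# Hypothetical toy sequences, not real proteins.
sequences = ["MKCAADLL", "GGCTTTTEV", "ACWDKK", "MMMMM"]
print(support("C-x(2,4)-[DE]", sequences))  # prints 2
```

A pattern would count as conserved here if its support reaches the minimum number of sequences chosen by the user.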
The time used by the program depends on
- the set of sequences,
- the class of patterns defined,
- the minimum number of sequences a pattern is to match,
- if an alignment or a query sequence is given, and
- the greediness of the search.
Version 2.0 was the last major version of Pratt; in version 2.1,
relatively minor things have been added or changed.
New features in version 2.1 include:
-----------------------------------
- When showing the sequence segments matching each pattern, the sequence
symbols matching non-wildcard positions (components) in the pattern
are written in upper case, while sequence symbols matching wildcards
are in lower case. Also, gaps (-) are added to align the symbols
matching each pattern component.
- Summary information about where the patterns match in the sequences is
written horizontally or vertically.
- The user can restrict where in the sequences patterns should be looked
for. This can be useful, for example, if the user knows some constraints
on the position of the patterns in one or more of the sequences.
- A few bugs in version 2.0 have been fixed in 2.1.
- Command line search control -- the user can now choose values for
all parameters that are in the menu directly from the command line.
This makes it quicker for experienced users to specify their search,
and also makes it easier to call Pratt from inside other programs.
- The menu has been changed: two-letter commands are now used.
- When using Pratt interactively (using the menu), some summary
information about the search parameters will be shown after the user
asks for the search to be started, and the user is given the opportunity
to go back to the menu to change parameter values.
- On-line help is available from the menu by typing "help <option>",
where <option> is one of the options in the menu, or "help" for general
help about Pratt.
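The match display described in the list above (upper case for components, lower case for wildcards, gaps for alignment) can be sketched as follows. This is an assumed reconstruction of the idea, not Pratt's actual output routine: the PROSITE-style pattern notation, the helper names `parse` and `render`, the toy sequences, and the choice to pad each flexible wildcard region on the right with "-" up to its maximum length are all assumptions made for this example.

```python
import re

def parse(pattern):
    """Split 'C-x(2,4)-[DE]' into ('comp', text) and ('wild', low, high) items."""
    items = []
    for element in pattern.split("-"):
        m = re.fullmatch(r"x(?:\((\d+)(?:,(\d+))?\))?", element)
        if m:
            low = int(m.group(1) or 1)
            high = int(m.group(2) or low)
            items.append(("wild", low, high))
        else:
            items.append(("comp", element))
    return items

def render(pattern, sequence):
    """Return the first matching segment, cased and gap-padded, or None."""
    items = parse(pattern)
    # Build a regex with one capture group per pattern element.
    regex = "".join(
        "(%s)" % item[1] if item[0] == "comp"
        else "(.{%d,%d})" % (item[1], item[2])
        for item in items
    )
    m = re.search(regex, sequence)
    if m is None:
        return None
    out = []
    for item, text in zip(items, m.groups()):
        if item[0] == "comp":
            out.append(text.upper())  # symbols matching components
        else:
            # Symbols matching wildcards, padded so components line up.
            out.append(text.lower().ljust(item[2], "-"))
    return "".join(out)

# Hypothetical toy sequences, not real proteins.
for seq in ["MKCAADLL", "GGCTTTTEV"]:
    print(render("C-x(2,4)-[DE]", seq))
# prints:
# Caa--D
# CttttE
```

Because every wildcard region is padded to the same width, the component symbols end up in the same columns for all matching segments.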
What was new in Version 2.0:
---------------------------
- Heuristics and branch-and-bound have been implemented, speeding up the
pattern search significantly, especially for sets of relatively similar
sequences (using special parameters, the search will be guaranteed).
- A multiple sequence alignment (Clustal W format) of a subset of the
sequences can be input to Pratt, and used to restrict the search to
patterns consistent with the alignment.
- A special "query" sequence (Fasta format) can be input, and Pratt will
only search for patterns matching this sequence. This makes it
convenient to use Pratt together with database homology search programs.
- DNA sequences can be analysed (complementary strand not included in
analysis).
Availability:
------------
The program has been implemented in ANSI C. It has been compiled and
run on different UNIX workstations (DEC Alpha, Sun, Silicon Graphics)
and also on Linux PCs (Pentium and DEC Alpha).
The source code is available on:
ftp://ftp.ii.uib.no/pub/bio/Pratt/Pratt2.1.tar
For documentation and more information, see
http://www.ii.uib.no/~inge/Pratt.html
References:
----------
"Finding flexible patterns in unaligned protein sequences"
Jonassen, I., Collins, J. F., Higgins, D. G.
Protein Science (1995) 4:1587-1595.
"Efficient discovery of conserved patterns using a pattern graph."
Jonassen, I.
Submitted to CABIOS.
If you want to be informed about bug fixes, new versions etc.,
send an email to inge at ii.uib.no. Bug reports, suggestions for
improvements etc. should be sent to the same address.
Sincerely,
Inge Jonassen,
Dept. of Informatics,
University of Bergen,
HIB,
N5020 BERGEN,
Norway
email: inge at ii.uib.no