Are you familiar with the following? I don't have any personal
experience with it yet.
asset.note (gcguser) Wed Feb 14 11:49:12 1996
/* ===========================================================================
*
* PUBLIC DOMAIN NOTICE
* National Center for Biotechnology Information
*
* This software/database is a "United States Government Work" under the
* terms of the United States Copyright Act. It was written as part of
* the author's official duties as a United States Government employee and
* thus cannot be copyrighted. This software is freely available to the
* public for use. The National Library of Medicine and the U.S. Government
* have not placed any restriction on its use or reproduction.
*
* Although reasonable efforts have been taken to ensure the accuracy
* and reliability of the software and data, the NLM and the U.S.
* Government do not and cannot warrant the performance or results that
* may be obtained by using this software or data. The NLM and the U.S.
* Government disclaim all warranties, express or implied, including
* warranties of performance, merchantability or fitness for any particular
* purpose.
*
* Please cite
*
* A. F. Neuwald and P. Green (1994) "Detecting Patterns in Protein
* Sequences", J. Mol. Biol. 239:698-712.
*
* in any work or product based on this material.
*
* The data structures used in this program are part of a package
* of object oriented C code for molecular biological applications
* being developed by A. F. Neuwald.
*
* ===========================================================================*/
ASSET (Aligned Segment Statistical Evaluation Tool) version 1.0
Includes three programs: asset, purge and scan. Each of these
programs requires FASTA-formatted input files.
The PURGE program removes closely related sequences from an input
file prior to running asset. This is important in order to reduce
input sequence redundancy. The command syntax for purge is:
purge <in_file> <score>
where <score> sets the maximum BLOSUM62 relatedness score allowed
between any two sequences in the output file (the output file is
created with the name <in_file>.b<score>). A score between 100 and
200 is recommended.
The SCAN program scans a database for sequences that contain motifs
detected by asset. A paper describing the details of the scan and
purge programs is in preparation.
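To make the purge syntax and its output naming concrete, here is a
minimal Python sketch. It only assembles the command line given above
and the output file name that the note describes; it does not run
purge. The helper names and the input file globins.fa are hypothetical
and chosen only for illustration.

```python
def purge_command(in_file: str, score: int) -> list[str]:
    """Build the argument list for: purge <in_file> <score>."""
    return ["purge", in_file, str(score)]


def purge_output_name(in_file: str, score: int) -> str:
    """Name of the file purge creates, per the note: <in_file>.b<score>."""
    return f"{in_file}.b{score}"


def is_recommended_score(score: int) -> bool:
    """The note recommends a score between 100 and 200."""
    return 100 <= score <= 200


# Hypothetical input file, for illustration only.
in_file, score = "globins.fa", 150
print(" ".join(purge_command(in_file, score)))  # purge globins.fa 150
print(purge_output_name(in_file, score))        # globins.fa.b150
print(is_recommended_score(score))              # True
```

The purged file (globins.fa.b150 in this example) would then be the
input to asset.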
The ASSET program will produce a "scan file" of the locally aligned
segment blocks when run with the -f<int> option; <int> specifies the
percentage of sequences in the input file that must contain a motif
before the corresponding motif block can be included in the scan
file. The scan file is given the name <in_file>.sn.
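As a rough illustration of the -f<int> threshold and the scan file
naming, the following Python sketch formats the option, derives the
scan file name, and converts a percentage into a sequence count. The
helper names and the file name are hypothetical. The 0 to 100 range
check and the rounding up are assumptions, since the note does not
say how asset validates or rounds the percentage.

```python
def scan_file_option(percent: int) -> str:
    """Format the asset option -f<int> for a given percentage."""
    # ASSUMPTION: a percentage is taken to lie between 0 and 100.
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return f"-f{percent}"


def scan_file_name(in_file: str) -> str:
    """Name of the scan file, per the note: <in_file>.sn."""
    return f"{in_file}.sn"


def min_sequences(percent: int, n_sequences: int) -> int:
    """Smallest number of sequences that reaches the given percentage."""
    # ASSUMPTION: rounds up; the note does not state how asset rounds.
    return (percent * n_sequences + 99) // 100


# Hypothetical example: a purged input file holding 40 sequences.
print(scan_file_option(50))               # -f50
print(scan_file_name("globins.fa.b150"))  # globins.fa.b150.sn
print(min_sequences(50, 40))              # 20
```

With -f50 and 40 input sequences, a motif would need to occur in at
least 20 sequences (under the rounding assumption above) before its
block is written to the scan file.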