In article <sc4vh4rd7vv.fsf at fes1.sanger.ac.uk>,
Keith James <kdj at fes1.sanger.ac.uk> wrote:
>>>>>> "Greg" == Greg Quinn <greg at franklin.burnham-inst.org> writes:
>
> Greg> I appreciate the direction of the EMBOSS project, but I'm not
> Greg> sure that FuzzNuc, or I guess the one that I would be
> Greg> interested in, FuzzPro is the functional equivalent of
> Greg> FindPatterns; unless I missed something on the description
> Greg> page for FuzzPro, I didn't see a specific query sequence
> Greg> syntax of the kind found in FindPatterns that precisely
> Greg> allows me to specify particular residues at each position in
> Greg> the query. It's this kind of ability which I'm looking
> Greg> for....
>
>Yes, you are right. Having only used it for searching using IUB
>ambiguity codes I assumed that when it was prompting for 'search
>pattern' it would accept some sort of regular expression whose syntax
>was described elsewhere in the docs.
>
>I've just tried it with GCG and Unix regexp type patterns and it
>doesn't accept them. Now I know!
Try using the Prosite pattern specification.
- The standard IUPAC one-letter codes for the amino acids are used.
- The symbol `x' is used for a position where any amino acid is accepted.
- Ambiguities are indicated by listing the acceptable amino acids for a
given position, between square brackets `[ ]'. For example: [ALT]
stands for Ala or Leu or Thr.
- Ambiguities are also indicated by listing between a pair of curly
brackets `{ }' the amino acids that are not accepted at a given
position. For example: {AM} stands for any amino acid except Ala and
Met.
- Each element in a pattern is separated from its neighbor by a `-'.
(Optional).
- Repetition of an element of the pattern can be indicated by following
that element with a numerical value or a numerical range between
parentheses. Examples: x(3) corresponds to x-x-x, x(2,4) corresponds to
x-x or x-x-x or x-x-x-x.
- When a pattern is restricted to the N- or C-terminus of a sequence,
that pattern starts with a `<' symbol or ends with a `>' symbol,
respectively.
- A period ends the pattern. (Optional).
Example:
[AC]-x-V-x(4)-{ED}
This pattern is translated as: [Ala or Cys]-any-Val-any-any-any-any-{any
but Glu or Asp}
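
If it helps to see the rules above in terms of ordinary regular expressions, here is a rough Python sketch that translates a Prosite-style pattern into a Python regexp. It is only an illustration of the syntax: the function name prosite_to_regex is made up for this example and is not part of EMBOSS or Prosite, and it does no validation beyond rejecting characters it does not recognise.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a Prosite-style pattern into a Python regular expression.

    Hypothetical helper for illustration only; not an EMBOSS or Prosite API.
    """
    p = pattern.strip()
    if p.endswith("."):              # a trailing period is optional
        p = p[:-1]
    out = []
    i = 0
    while i < len(p):
        c = p[i]
        if c == "-":                 # element separator carries no meaning
            pass
        elif c == "<":               # anchored at the N-terminus
            out.append("^")
        elif c == ">":               # anchored at the C-terminus
            out.append("$")
        elif c in "xX":              # any amino acid
            out.append(".")
        elif c == "[":               # accepted residues
            j = p.index("]", i)
            out.append("[" + p[i + 1:j].upper() + "]")
            i = j
        elif c == "{":               # residues that are not accepted
            j = p.index("}", i)
            out.append("[^" + p[i + 1:j].upper() + "]")
            i = j
        elif c == "(":               # repeat count or range, e.g. (4) or (2,4)
            j = p.index(")", i)
            out.append("{" + p[i + 1:j] + "}")
            i = j
        elif c.isalpha():            # a single one-letter residue code
            out.append(c.upper())
        else:
            raise ValueError(f"unexpected character {c!r} in pattern")
        i += 1
    return "".join(out)


regex = prosite_to_regex("[AC]-x-V-x(4)-{ED}")
print(regex)                                       # [AC].V.{4}[^ED]
print(re.search(regex, "MKAGVLLLLKW").group())     # AGVLLLLK
```

The test sequence MKAGVLLLLKW is invented for the example. The same function turns x(2,4) into .{2,4} and a leading `<' into ^, so the repetition and terminus rules map directly onto regexp syntax.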
Gary Williams Tel: +44 1223 494522 Fax: +44 1223 494512
mailto:G.Williams at hgmp.mrc.ac.uk
http://www.hgmp.mrc.ac.uk/
Bioinformatics, MRC HGMP Resource Centre, Hinxton, Cambridge, CB10 1SB, UK