
nucleotide consensus search

Paul Roy proy at rsvs.ulaval.ca
Wed Jun 25 09:34:53 EST 1997


On 24 Jun 1997, Nathan Weyand wrote:

> Hi all,
>   I have a question about searching for a short nucleotide consensus
> sequence.  The sequence is only 10 nucleotides long and represents a
> protein binding site within a promoter.  Specifically, I want to search
> the E. coli genome for this consensus sequence.  I have tried a fasta
> and blastn search in GCG without success.  Is it possible to search for
> such a short consensus in GCG databases such as - [nr n Non-redundant
> GenBank+EMBL+DDBJ+PDB sequences]?

Fasta and blastn probably won't work on such short sequences.  One
possibility is FINDPATTERNS permitting, say, 2 mismatches (there will
probably be too many "hits" with 3).  However, if you have several of
these sequences and if the contribution of the individual bases is unequal
(e.g. the T1, A2 and T6 in the -10 box are more conserved than the others),
then you can align the known sequences (make individual sequence files and
then use PILEUP, or enter them directly using LINEUP).  You can then run
PROFILEMAKE on the alignment, followed by PROFILESEARCH.  These latter two
programs are meant for proteins but can be made to work with DNA sequences;
don't forget to use  -MATRix=profiledna.cmp  with PROFILEMAKE.  I have used
these programs successfully to hunt for consensus sequences at the ends of
"59-base elements" in integrons (see Fig. 7 of Collis and Hall, Molecular
Microbiology 6: 2875-2885 (1992)).
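
For anyone without access to GCG, the two ideas above (a fixed pattern
with a mismatch allowance, and a position-weighted profile built from
aligned known sites) can be sketched in a few lines of Python.  This is
only an illustration of the principle, not what FINDPATTERNS or
PROFILESEARCH do internally.  The function names, the example sites, the
toy sequence, the uniform base composition and the pseudocount are all
assumptions made up for the sketch, and only the given strand is scanned.

```python
import math
from collections import Counter

BASES = "ACGT"

def find_with_mismatches(genome, pattern, max_mismatches=2):
    """Return (position, window, mismatches) for each window of the genome
    that differs from the pattern at no more than max_mismatches positions."""
    k = len(pattern)
    hits = []
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        mismatches = sum(1 for a, b in zip(window, pattern) if a != b)
        if mismatches <= max_mismatches:
            hits.append((i, window, mismatches))
    return hits

def build_profile(aligned_sites, pseudocount=1.0):
    """Build a log-odds weight matrix from ungapped sites of equal length.
    Assumes a uniform background (0.25 per base), which is a simplification."""
    length = len(aligned_sites[0])
    total = len(aligned_sites) + 4 * pseudocount
    profile = []
    for col in range(length):
        counts = Counter(site[col] for site in aligned_sites)
        profile.append({
            b: math.log2(((counts[b] + pseudocount) / total) / 0.25)
            for b in BASES
        })
    return profile

def scan_profile(genome, profile, threshold):
    """Score every window against the profile and keep those at or above
    the threshold.  Windows containing anything but A, C, G, T are skipped."""
    k = len(profile)
    hits = []
    for i in range(len(genome) - k + 1):
        window = genome[i:i + k]
        if any(b not in BASES for b in window):
            continue
        score = sum(profile[j][b] for j, b in enumerate(window))
        if score >= threshold:
            hits.append((i, window, score))
    return hits

# Invented example data: five hexamers loosely resembling a -10 box,
# and a short toy sequence standing in for the genome.
known_sites = ["TATAAT", "TATGAT", "TACAAT", "TATACT", "GATAAT"]
toy_genome = "GGCGTATAATGCC"

exact_hits = find_with_mismatches(toy_genome, "TATAAT", max_mismatches=0)
profile = build_profile(known_sites)
all_scores = scan_profile(toy_genome, profile, threshold=float("-inf"))
best_hit = max(all_scores, key=lambda hit: hit[2])
print(exact_hits)
print(best_hit)
```

The mismatch search treats every position alike, whereas the profile
gives more weight to positions that are well conserved among the known
sites, which is the reason for preferring it when the bases contribute
unequally.  A real search would also have to cover the reverse complement
and to pick a score threshold by looking at how the known sites score.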

******************************************************************************

 Paul H. Roy                             Phone:  +1 418 654 2705
 Departement de biochimie,FSG            FAX:    +1 418 654 2715
 Universite Laval                        E-mail: proy at rsvs.ulaval.ca
 Quebec, QC  G1K 7P4
 CANADA

******************************************************************************
     

On 24 Jun 1997, Nathan Weyand wrote:

> Hi all,
>   I have a question about searching for a short nucleotide consensus 
> sequence.  The sequence is only 10 nucleotides long and represents a 
> protein binding site within a promoter.  Specifically, I want to search 
> the E. coli genome for this consensus sequence.  I have tried a fasta 
> and blastn search in GCG without success.  Is it possible to search for 
> such a short consensus in GCG databases such as - [nr n Non-redundant 
> GenBank+EMBL+DDBJ+PDB sequences]?  Does anybody have any suggestions on 
> what settings I need to adjust for a successful search or whether web 
> resources exist that can be used for my problem?
> 
> Any suggestions would be welcome! Thanks in advance for any help you can 
> offer.  I will post a summary of advice I get that leads me in the 
> right direction.
> 
> sincerely,
> 
> Nate Weyand
> email:  Nate_Weyand at hlthsci.med.utah.edu
> 
> 





More information about the Info-gcg mailing list
