In <11JUN199213074770 at seqvax.caltech.edu> mathog at seqvax.caltech.edu writes:
> Situation: you have multiple peptide sequences _from the same
> protein_ , but their order is not known. Each sequence is short (say
> 6-10 aa). What database search tool, if any, can accept this sort of
> data?
>
> I already did a trial run with the Drosophila white gene, where I took
> three peptide chunks and made one test sequence out of them (like 61-70 +
> 101-110 + 151-160 = sequence, although I don't recall the exact numbers).
> I ran this through Genbank's FASTA, but it only matched the best subregion
> and did not align (with gaps) the other two. I sent the same thing to
> BLAZE, but it hasn't answered yet, so no info on that.
>
> Any suggestions?
>
> David Mathog
> mathog at seqvax.caltech.edu
> manager, sequence analysis facility, biology division, Caltech
I hear that FastA at GenBank is really busy these days, so they might not like
this suggestion, but why not submit a separate search with each peptide rather
than with them concatenated? FastA doesn't introduce huge gaps, as might be
necessary to align your peptides along a DB entry. Further, when concatenated
the peptides have been ordered, and if that order doesn't match the order in a
homologous DB entry, FastA won't make a 'loop back' type of alignment.
Admittedly, the multiple search approach requires that you compare the outputs
to find if each peptide is aligned with the same DB entry, but that may not be
too terrible a task.
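
To make the comparison step less tedious, something like the following Python
sketch might do. It assumes you have already pulled the list of DB entry names
out of each search report by hand or with your own parser; the peptide labels,
entry names, and the function name are all made up for illustration, and no
FastA output format is assumed.

```python
from collections import defaultdict

def entries_hit_by_all(hits_per_peptide):
    """Return the DB entries that appear in the hit list of every peptide.

    hits_per_peptide maps a peptide label to the entry names reported
    by that peptide's separate search (hypothetical input format).
    """
    peptides_by_entry = defaultdict(set)
    for peptide, entries in hits_per_peptide.items():
        for entry in entries:
            peptides_by_entry[entry].add(peptide)
    total = len(hits_per_peptide)
    # Keep only entries that were hit by all of the peptides.
    return sorted(e for e, peps in peptides_by_entry.items() if len(peps) == total)

# Made-up hit lists, one per separate search.
example_hits = {
    "pep1": ["ENTRY_A", "ENTRY_B", "ENTRY_C"],
    "pep2": ["ENTRY_B", "ENTRY_A"],
    "pep3": ["ENTRY_A", "ENTRY_D"],
}
print(entries_hit_by_all(example_hits))  # ['ENTRY_A']
```

Relaxing the test to "hit by at least two peptides" is a one-line change if
one of the peptides turns out to be poorly conserved.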
As an alternative, if you have access to the GCG package, use the FindPatterns
program. You can create a patterns file containing all three peptides and have
the program find matches in a database. There is no scoring for homologous
substitutions, however, and you will still have to look through the output for
a hit for each peptide in the same DB sequence.
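
If GCG isn't available, the same idea (exact matches only, peptides in any
order) can be roughed out in a few lines of Python. This is only a stand-in for
illustration, not FindPatterns itself: the function name, the dictionary used
as a "database", and the toy sequences below are all invented.

```python
def find_peptides(peptides, database):
    """Report entries that contain every peptide as an exact substring.

    database maps an entry name to its sequence (hypothetical layout).
    Returns {entry: {peptide: [0-based start positions]}}.
    """
    results = {}
    for entry_id, seq in database.items():
        positions = {}
        for pep in peptides:
            starts = []
            i = seq.find(pep)
            while i != -1:
                starts.append(i)
                i = seq.find(pep, i + 1)
            if not starts:
                break  # this entry lacks one peptide, so skip it
            positions[pep] = starts
        else:
            # Reached only when no peptide was missing.
            results[entry_id] = positions
    return results

# Toy data: the peptide order deliberately differs from the order in toy1.
toy_db = {
    "toy1": "MKTAYIAKQRQISFVKSHFSRQ",
    "toy2": "MKTAYIAKQRAAAA",
}
toy_peptides = ["SHFSRQ", "MKTAYI", "QISFVK"]
print(find_peptides(toy_peptides, toy_db))
```

Because each peptide is located independently, the order in which you list
them doesn't matter, which sidesteps the 'loop back' problem. As with
FindPatterns, a single substitution in a peptide will cause a miss.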
MJW
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
/ Michael J. Weise, Ph.D. \ Univ.of Ga. BioScience Computing Facility \
( weise at bscf.uga.edu \ Dept.of Genetics UGa, Athens GA 30602 )
\ _ _ _'Tis_only_me_speak'n._ _\_ _ _ _ _ _ _ (706) 542-1409_ _ _ _ _ _ _ /