From: BORCIM::BRETT 17-MAR-1994 16:09:13.77
To: BRETT
CC:
Subj: letter to gcg
Hello. I would like to use GCG to align a large set of protein sequences (>100)
in order to create a consensus sequence. The problem is that many of the
sequences are only partial. I tried to use PILEUP, but it did not handle
the sequences with internal overlap well, i.e.:
+++++++++
++++++++++++++++++++++++++++++++++++++
++++++++++++
++++++++++
+++++++++++++++++++++++++++++++++++++++
+++++++++++++++
++++++++++++++
The result was a blank output file. However, when I used only the full-length
sequences as input, I got a nice alignment back. So I have been using this
alignment as a backbone and adding the partial sequences to it with LINEUP.
The Zip routine seems to be able
to correctly place these internal sequences. However, LINEUP can only handle
30 sequences. I have been considering making several LINEUP alignments and then
aligning the consensuses I get from them. Is this a reasonable way to go? I am
afraid of misrepresenting some columns with this approach. Also, how can I use
the output from LINEUP in PRETTY? How can I extract the consensus from an .msf file?
Can this consensus then go into PILEUP as a new sequence? Please try to answer
as many of these questions as you can. Thanks,
Brett Lindenbach
brett at borcim.wustl.edu
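
To make the consensus question concrete, here is a minimal sketch in Python of the kind of column-by-column consensus I have in mind. It is not how any GCG program computes a consensus: the function name, the gap characters, the 50% threshold, and the toy sequences are all assumptions made for illustration. Gap positions from partial sequences are ignored, so a column is judged only by the sequences that actually cover it.

```python
from collections import Counter

# Characters treated as "no residue here" (an assumption for this sketch).
GAP_CHARS = {".", "-", "~", " "}

def column_consensus(rows, min_fraction=0.5):
    """Plurality consensus over already-aligned rows, ignoring gaps.

    A column gets its most common residue if that residue makes up at
    least min_fraction of the non-gap characters, otherwise 'x'.
    Columns covered by no sequence at all get '.'.
    """
    width = max(len(r) for r in rows)
    padded = [r.ljust(width, ".") for r in rows]  # pad short rows with gaps
    out = []
    for col in zip(*padded):
        residues = [c.upper() for c in col if c not in GAP_CHARS]
        if not residues:
            out.append(".")
            continue
        residue, count = Counter(residues).most_common(1)[0]
        out.append(residue if count / len(residues) >= min_fraction else "x")
    return "".join(out)

# Toy alignment: two full-length and two partial sequences (made-up data).
alignment = [
    "MKTAYIAKQR",
    "..TAYLAK..",
    "MKSAYIAKQR",
    ".....IAKQR",
]
print(column_consensus(alignment))
```

Taking a consensus of several such consensuses would count each group once regardless of how many sequences cover a column, which is the misrepresentation I am worried about.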