In article <199609151815.NAA17395 at mimer.scs.uiuc.edu>,
cole at MIMER.SCS.UIUC.EDU (Ronald N. Cole) wrote:
> Hi,
> I am faced with a simple, but labor intensive task. I have a large
> list of pdb files (466 to be exact). I would like to turn this list into a
> list of accession numbers. I then want to do a pileup on all of these
> sequences.
I would be very interested to learn your solution to this problem. So far I
have only a very clumsy workaround. First, I edit a list of sequence names
(produced by FASTA/NoHis/NoAlign) down to a pure list of names with no extra
text. Next, I use DATASET to turn that list into a GCG data library, and
TOFASTA to write all of the sequences out as one concatenated list. Finally,
I use CLUSTAL to do the alignment.
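The hand-editing step (reducing the list to bare names) is the part most easily scripted. Here is a minimal sketch in Python. It assumes that each useful line of the list begins with the sequence name followed by whitespace and then the extra text, and that lines starting with "!" are comments. Both are assumptions about the list layout, and the function name extract_names is made up for illustration, so check them against your own file first.

```python
def extract_names(lines):
    """Return the bare sequence names from a list of text lines.

    Assumptions (not taken from any GCG documentation):
      * the name is the first whitespace-delimited token on the line
      * blank lines and lines starting with "!" carry no name
    Duplicate names are dropped, keeping the first occurrence.
    """
    names = []
    seen = set()
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # blank line
        name = tokens[0]
        if name.startswith("!"):
            continue  # assumed comment marker
        if name not in seen:
            seen.add(name)
            names.append(name)
    return names


if __name__ == "__main__":
    # Hypothetical sample lines, only to show the behaviour.
    sample = [
        "PDB:1ABC   some descriptive text",
        "",
        "! a comment line",
        "PDB:2XYZ   more text",
        "PDB:1ABC   repeated entry",
    ]
    for name in extract_names(sample):
        print(name)
```

The output is one name per line, which can then be saved and used as the input list for the remaining steps.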
I have not been able to use PILEUP on more than about 20 sequences of varying
length because of the 2000-gap limitation. The only way around this is to limit
each sequence to the region that will align well (by giving begin and end
positions).
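If the sequences are already in one concatenated FASTA-format file, the trimming to begin and end positions can also be done outside the alignment program. The following is only a sketch, assuming plain FASTA text (a ">" header line followed by sequence lines), 1-based inclusive positions, and that the first word of each header is the sequence name. The function names and the sample data are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def trim_records(records, ranges):
    """Cut each sequence to its 1-based, inclusive (begin, end) range.

    `ranges` maps a sequence name (first word of the header) to a
    (begin, end) tuple. Sequences without an entry are kept whole.
    """
    trimmed = []
    for header, seq in records:
        words = header.split()
        name = words[0] if words else ""
        if name in ranges:
            begin, end = ranges[name]
            seq = seq[begin - 1:end]
        trimmed.append((header, seq))
    return trimmed


def format_fasta(records, width=60):
    """Write (header, sequence) pairs back out as FASTA text."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical two-sequence input; only "a" is trimmed.
    text = ">a first sequence\nABCDEFGHIJ\nKLMNO\n>b\nQRSTU\n"
    trimmed = trim_records(parse_fasta(text), {"a": (3, 7)})
    print(format_fasta(trimmed), end="")
```

The begin and end positions still have to be chosen by eye or from a prior search, so this only removes the manual editing, not the judgment about which region aligns well.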
>
> Does anyone know of an automated or even semi-automated way of doing
> this? I can write scripts or do any programming necessary. I just want to
> know the best way to approach this.
>
> Thanks in Advance,
> Skip Cole
>
> **************************************************************************
> * Skip Cole (aka Ron Cole) * Calculation performed; *
> * cole at mimer.scs.uiuc.edu * Fast, Reliable or Meaningful. *
> * (217)355-5308 * Pick any 2 *
> **************************************************************************
--
Stuart M. Brown, Molecular Biology Consultant
NYU-MC Research Computing Resource, Dept. of Cell Biology
550 First Ave, New York, NY 10016
Phone: (212)263-7689 FAX: (212)263-8139