
5000+ seqs in GCG...

Peter Rice pmr at unst.sanger.ac.uk
Fri Mar 3 08:30:59 EST 1995


In article <1995Mar3.112658.14868 at reks.uia.ac.be> przemko at reks.uia.ac.be (Przemko) writes:
>   This is a question to all of you GCG experts, die-hards, gurus etc...
>   I would like to do a multiple sequence alignment on >5000 seqs (short). I need
>   this to do evolutionary analysis. GCG needs a .msf format, so now the
>   question is:
>   How can I specify that number of sequences for the pileup program WITHOUT typing
>   all the names in my favorite text editor (pico NOT vi...)?

Firstly, remember that GCG can convert other sequence formats. You must
already have these sequences in some format. For example, if they are all
in one file in PIR format, you could try FromPIR to create GCG-format
files, then use *.seq (or whatever wildcard matches them) to read them in.
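
Before handing a wildcard to an alignment program, it can help to check
how many files it really matches. The following is only a rough sketch in
Python (not part of GCG); the function name matching_sequence_files is
made up for illustration, and it assumes the converted files sit in the
current directory with a .seq extension.

```python
import glob


def matching_sequence_files(pattern="*.seq"):
    """Return the sorted list of file names matched by a shell-style wildcard."""
    return sorted(glob.glob(pattern))


if __name__ == "__main__":
    files = matching_sequence_files()
    print(f"{len(files)} files match *.seq")
    # Show the first few names as a sanity check.
    for name in files[:5]:
        print(name)
```

If the count is not what you expect, the conversion step probably skipped
or duplicated some entries.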

You will also probably have to change the parameters in the PileUp code
to allow so many sequences. For example, make a new version (under another
name) that allows more sequences but limits their length.
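
To pick sensible limits you need to know how many sequences you have and
how long the longest one is. Here is a minimal sketch in Python (again not
part of GCG) that reports both for a PIR-format file. It assumes the usual
PIR layout: a header line starting with ">", one description line, then
sequence lines ending in "*". The name pir_stats and the file name
myseqs.pir are hypothetical.

```python
def pir_stats(lines):
    """Return (number_of_entries, longest_sequence_length) for PIR-format text."""
    count = 0
    longest = 0
    current = 0
    skip_description = False
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # New entry: close off the previous one and start counting again.
            count += 1
            longest = max(longest, current)
            current = 0
            skip_description = True  # the line after the header is free text
        elif skip_description:
            skip_description = False
        elif count:
            # Count residues and gap characters; ignore spaces, digits and "*".
            current += sum(1 for ch in line if ch.isalpha() or ch == "-")
    return count, max(longest, current)


if __name__ == "__main__":
    with open("myseqs.pir") as handle:  # hypothetical file name
        n_seqs, max_len = pir_stats(handle)
    print(f"{n_seqs} sequences, longest is {max_len} residues")
```

The two numbers tell you how far the sequence-count limit must be raised
and how far the length limit can be cut back.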

Oh yes, and don't use WPI. Just putting *.seq on the command line is
*so* much easier.

But having done that, how do you plan to edit the alignment you get?

You could also try other alignment programs, CLUSTALW for example.

If you have an account on the Belgian EMBnet node, I'm sure they can help
you too.

--
------------------------------------------------------------------------
Peter Rice                           | Informatics Division
E-mail: pmr at sanger.ac.uk             | The Sanger Centre
Tel: (44) 1223 494967                | Hinxton Hall, Hinxton,
Fax: (44) 1223 494919                | Cambs, CB10 1RQ
URL: http://www.sanger.ac.uk/~pmr    | England



More information about the Info-gcg mailing list