Sequence editor that can delete columns?

H.J. Sluiman mbhsl at s-crim1.dl.ac.uk
Wed Mar 5 05:07:33 EST 1997

In article <5f45kd$b4h at dismay.ucs.indiana.edu>, gilbertd at bio.indiana.edu (Don Gilbert) writes:
|> P.S. on SeqPup usage to delete columns.
|> You can use it now to do what you want, using the current mask
|> functions.  If you create a mask that includes only the
|> data of interest to you, then you can choose the Sequence/Seq Masks/
|> Compress by mask command.  This will eliminate all but the masked
|> bases.  
|> To create the mask that you need, in the multi align view, select
|> the "View" popup menu to point to Mask # 1, 2, 3 or 4.  A short hand
|> for selecting full columns of bases is to click the column index
|> line above the bases.  Hold mouse down and select many columns.
|> If your bad subset is smaller than your good subset, select the 
|> bad subset, then choose the Sequence/Seq Masks/Invert mask function
|> to select the good subset.  Save the data now to have your mask
|> persist to the next session.  Then do the above 'Compress by mask'
|> and analyze that good data subset.
|> You can also use masks to specify non-contiguous, irregular subsets
|> of data, rather than just columns, if that is of interest.
|> - Don
|> PPS, I know a better manual is needed so one can figure out things
|> like this.
|> --
|> -- d.gilbert--biocomputing--indiana u--bloomington--gilbertd at bio.indiana.edu

I find the mask function in SeqPup, as described by Don, very useful. I use it to eliminate
all columns containing indels, which I inserted earlier as a visual aid for identifying nt
blocks that are supposed to be homologous (on the basis of secondary structure info, for
example), before starting phylogenetic analyses. (I am not sure, though, whether this is
really necessary. Any feedback from real experts in molecular phylogenetics would be welcome!)
However, if you have created a fairly complex mask (my data matrix is typically 95 x
1770 nts with lots of masked columns) and you then decide that your multiple alignment
needs improvement, it is unfortunate that the mask doesn't change accordingly. It is
particularly unfortunate that moving only a few columns or nts at the beginning of a long
seq often screws up the mask further down, so you spend a lot of time making the necessary
corrections. Still, SeqPup continues to be a very useful tool for me.
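
For anyone who wants to see the idea outside SeqPup, here is a minimal Python sketch of
what the mask steps amount to: build a column mask, optionally invert it, and keep only
the masked columns. This is not SeqPup's code or file format. The function names
(gap_free_mask, invert_mask, compress_by_mask), the gap characters and the toy alignment
are all assumptions made up for illustration.

```python
from typing import List


def gap_free_mask(alignment: List[str], gap_chars: str = "-.") -> List[bool]:
    """Return a column mask that is True where no sequence has a gap.

    Hypothetical helper: the gap characters "-" and "." are an assumption.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    return [
        all(seq[col] not in gap_chars for seq in alignment)
        for col in range(length)
    ]


def invert_mask(mask: List[bool]) -> List[bool]:
    """Flip the mask, e.g. turn a 'bad columns' selection into 'good columns'."""
    return [not keep for keep in mask]


def compress_by_mask(alignment: List[str], mask: List[bool]) -> List[str]:
    """Keep only the columns where the mask is True, in every sequence."""
    return [
        "".join(base for base, keep in zip(seq, mask) if keep)
        for seq in alignment
    ]


# Toy alignment (invented data): columns 2, 3 and 5 contain an indel.
alignment = [
    "ACG-TTA",
    "AC-GTTA",
    "ACGGT-A",
]

good_columns = gap_free_mask(alignment)
bad_columns = invert_mask(good_columns)
compressed = compress_by_mask(alignment, good_columns)

for seq in compressed:
    print(seq)
```

Note that the mask here is just a list of per-column flags, independent of the sequences.
That is why, in a scheme like this, shifting bases in the alignment leaves the flags
pointing at the old column positions until the mask is rebuilt or corrected.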

Hans Sluiman

More information about the Mol-evol mailing list