Using gap to align sequences.

James Bonfield jkb at mrc-lmb.cam.ac.uk
Tue Aug 27 04:29:43 EST 1996

In article <4vffvh$mb1 at mserv1.dl.ac.uk> Rifat Hamoudi <rifat at icr.ac.uk> writes:
>	Does anyone know if 
>1) it is possible to use gap to align 
>the whole of the consensus sequence to a sequence in the database.
>I am having trouble using the Align algorithm because it does not seem 
>to align the whole of the sequence.

I'm not sure I understand this question. If you have a large consensus against
which you wish to compare a single reading, then use assembly (or find
internal joins if the reading is already in the database). The 'align' command
in the contig editor only aligns the reading with the portion of the consensus
it currently overlaps. It's far from perfect; improving it is on the (huge)
list of things to do.

>2) Also is it possible to use gap to take in ABI files without having to
> convert them to scf or exp files, since having ABI and other file 
>formats takes up a lot of disk space.

Exp files are needed (we cannot assemble the binary formats directly), but
they're tiny compared to trace files. The whole purpose of having SCF files
instead of ABI files is that they're so much smaller. With the latest SCF
version the files also compress reasonably well (using gzip). Gap4 and trev
will automatically uncompress them when they're needed for viewing. The ABI
files can be archived on tape and then removed from disk.
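
As a rough illustration of the gzip step, here is a minimal Python sketch that
compresses every SCF file in a directory and reports the sizes before and
after. The function names, the "*.scf" pattern and the ".gz" suffix are my own
assumptions for the example; check the Staden documentation for the naming
that Gap4 and trev actually expect for compressed traces. The originals are
left in place so nothing is lost if the naming turns out to be wrong.

```python
import gzip
import shutil
from pathlib import Path


def gzip_trace(path):
    """Compress one trace file to <name>.gz (hypothetical naming).

    Returns (original_size, compressed_size) in bytes.
    """
    src = Path(path)
    dst = src.with_name(src.name + ".gz")
    with open(src, "rb") as fin, gzip.open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return src.stat().st_size, dst.stat().st_size


def gzip_traces(directory, pattern="*.scf"):
    """Compress every file matching pattern; originals are not deleted."""
    return {p.name: gzip_trace(p) for p in sorted(Path(directory).glob(pattern))}
```

Calling gzip_traces("traces") on a hypothetical "traces" directory returns a
dictionary mapping each file name to its (original, compressed) sizes, which
makes it easy to see how much space a given project would save.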

>3) If you have more than 36 sequences in a single contig is there a way 
>that you could scroll down to look at your sequences or is the only way 
>of doing this is to get a 17 or 20" screen?

This too is on the list of things to do, and at a higher priority than tidying
up the editor align command. We do recognise this as a problem and it ought to
be relatively easy to solve. Note that you don't necessarily need a large
screen at present; instead you can drag the window around (which is very
clumsy). Alternatively it's possible to change the editor font, although this
tactic can only be taken so far before things become unreadable. See the
comments at the top of $STADENROOT/lib/gap/contig_editor.tcl.

>4) When you use VEPE to screen against a vector, can you get a 
>chromatogram with the vector sequence left out and just your internal 
>sequence in a separate chromatogram.

This goes against our philosophy. We wish never to throw away any data
completely. All the original information should be available during the
project. If the motivation is simply to save disk space, consider that SCF
files are typically around 1/2 the size of ABI files, and that gzipped SCF
files are around 1/5 the size of ABI files.
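
To put those ratios in concrete terms, here is a small Python sketch of the
arithmetic. The function name is hypothetical, and the 1/2 and 1/5 factors are
only the "typically around" figures quoted above, so treat the results as
rough estimates rather than guarantees.

```python
def estimate_trace_storage(abi_mb):
    """Estimate disk usage in MB for traces that occupy abi_mb as ABI files.

    Uses the approximate ratios quoted above:
    SCF is about 1/2 of ABI, gzipped SCF is about 1/5 of ABI.
    """
    scf_mb = abi_mb / 2
    gzipped_scf_mb = abi_mb / 5
    return scf_mb, gzipped_scf_mb
```

For example, estimate_trace_storage(100) suggests that 100 MB of ABI files
would need roughly 50 MB as SCF, or roughly 20 MB as gzipped SCF.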

James Bonfield (jkb at mrc-lmb.cam.ac.uk)   Tel: 01223 402499   Fax: 01223 412282
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/
