NCBI UniGene files

jkb at mrc-lmb.cam.ac.uk
Tue Jan 30 09:46:31 EST 2001

In <3A764BFE.5217FA63 at staff.usyd.edu.au> Bill Blackhall <b.blackhall at staff.usyd.edu.au> writes:

> How can I get Pregap to recognise the individual ESTs inside the cluster
> file that NCBI's UniGene database generates? Sequencher will align them
> all simply by importing the cluster file, but Pregap and Gap only see
> the cluster file as one long sequence rather than picking out the
> individual ESTs. Any help will be welcome.

What format are these cluster files? We don't support any multi-sequence
file formats directly in pregap4 (although we probably ought to), so
regardless of format you'll need to split them into one file per sequence.

There's a fasta2exp program (actually it's an nawk script, so you may need to
edit the first line to point at your awk interpreter if it's not in /usr/bin)
that will split a FASTA file into multiple experiment files. This doesn't
help if you also want the trace files visible, though, since a FASTA file
carries only the base calls.
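
If awk isn't convenient, the splitting step itself can be sketched in a few
lines of Python. This is only an illustration, assuming the cluster file
really is multi-sequence FASTA (each record starting with a ">" header line).
It is not fasta2exp and it does not write experiment files; it just writes
one small FASTA file per sequence. The function names and the output file
naming scheme are made up for this example.

```python
import os
import re


def split_fasta(text):
    """Return a list of (name, sequence) pairs parsed from multi-FASTA text."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                records.append((name, "".join(chunks)))
            words = line[1:].split()
            # Use the first word of the header as the name (hypothetical choice).
            name = words[0] if words else "seq%d" % (len(records) + 1)
            chunks = []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def write_records(records, out_dir):
    """Write each record to its own FASTA file; return the list of paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for i, (name, seq) in enumerate(records, 1):
        # Replace characters such as "|" or "#" that are awkward in file names.
        safe = re.sub(r"[^A-Za-z0-9._-]", "_", name)
        # The numeric prefix keeps file names unique even if names repeat.
        path = os.path.join(out_dir, "%03d_%s.fasta" % (i, safe))
        with open(path, "w") as fh:
            fh.write(">%s\n" % name)
            for j in range(0, len(seq), 60):
                fh.write(seq[j:j + 60] + "\n")
        paths.append(path)
    return paths
```

Typical use would be to read the cluster file into a string, call
split_fasta() on it, and pass the result to write_records() with an output
directory of your choosing. Like fasta2exp, this gives you sequences only,
with no traces.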

James Bonfield (jkb at mrc-lmb.cam.ac.uk)   Tel: 01223 402499   Fax: 01223 213556
Medical Research Council - Laboratory of Molecular Biology,
Hills Road, Cambridge, CB2 2QH, England.
Also see Staden Package WWW site at http://www.mrc-lmb.cam.ac.uk/pubseq/
