Keith Vass kvass at compserv.gla.ac.uk
Tue Dec 17 07:36:31 EST 1996

Micha Ron wrote:
> Dear netters,
> I have a question about GCG software.  How do I create a new subset of
> sequences from the output of stringsearch?  I would like to do a fasta
> search with all the output sequences received from the stringsearch.
> Mark Band
> michar at indycc1.agri.huji.ac.il
Francois Jeanmougin wrote:
Like lookup, stringsearch gives you output formatted as a file of
sequence names (FOSN). So you can use fasta as follows:
fasta my_sequence
Search for query in what sequence(s) (* SwissProt:* *) ?
The @ tells GCG that this is a FOSN file, and GCG will use all of the
sequences it lists for the search. You don't need to create a database
or copy the sequences (so no extra disk space is needed).

This is completely sensible.
But it is also useful to define a SUBSET of the sequences identified by
stringsearch. On Unix, for example, if the stringsearch output is in a
file called mydata.strings, you can pick out all the GenBank entries
that mention mouse with:
egrep -i mouse mydata.strings > mymousedata.strings

This will put all the mouse entries into a new file called
mymousedata.strings.
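
The filtering step can be tried out on a small made-up file first. This
is only a sketch: the entry names and descriptions below are invented
placeholders, not real stringsearch output, and the one-entry-per-line
layout is an assumption for illustration.

```shell
# Hypothetical stand-in for stringsearch output (entries are made up).
cat > mydata.strings <<'EOF'
GB_RO:EXAMPLE1  ! Mouse example gene one
GB_PR:EXAMPLE2  ! Human example gene two
GB_RO:EXAMPLE3  ! House mouse example gene three
EOF

# Keep only the lines that mention "mouse", ignoring case.
grep -i mouse mydata.strings > mymousedata.strings

# Count the lines before and after as a quick sanity check.
grep -c . mydata.strings
grep -c . mymousedata.strings
```

Note that a plain substring match also keeps lines such as "dormouse";
with GNU or BSD grep, adding -w restricts the match to whole words.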

Keith Vass

More information about the Info-gcg mailing list
