Micha Ron wrote:
>> Dear netters,
>> I have a question about GCG software. How do I create a new subset of
>> sequences from an output of stringsearch? I would like to do a fasta
>> search with all the output sequences received from the stringsearch.
>> Mark Band
>michar at indycc1.agri.huji.ac.il

Francois Jeanmougin wrote:
Like lookup, stringsearch gives you output formatted as a file of
sequence names (FOSN), so you can use fasta as follows:
Search for query in what sequence(s) (* SwissProt:* *) ?
Answer this prompt with the name of the stringsearch output file,
prefixed with @. The @ tells GCG that this is a FOSN file, and GCG will
take all the sequences listed in it for the search. You don't need to
create a database or copy the sequences (so no extra disk space is needed).
This is completely sensible.
But it can also be useful to define a SUBSET of the sequences identified
by stringsearch. In Unix, if the stringsearch output is in a file called
mydata.strings, you can pick out all the mouse GenBank entries like this:
egrep -i mouse mydata.strings > mymousedata.strings
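
To try the filter on something small first, here is a rough sketch. The
three sample lines are made up for illustration and are NOT real
stringsearch output, so check what your own file looks like before relying
on it. It uses grep -E, which is the same thing as egrep on current
systems. Note that the match is on the whole line, so any line mentioning
"mouse" anywhere (in any letter case) is kept.

```shell
# Hypothetical sample standing in for stringsearch output; real files will differ.
printf '%s\n' \
  'GB_RO:MUSEXAMPLE1 ! Mouse example entry' \
  'GB_PR:HUMEXAMPLE1 ! Human example entry' \
  'GB_RO:MUSEXAMPLE2 ! mouse example entry, lower case' > mydata.strings

# -i makes the match case-insensitive, so "Mouse" and "mouse" both match.
grep -E -i mouse mydata.strings > mymousedata.strings

# Show how many lines were kept.
grep -c . mymousedata.strings
```
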
This will put all the mouse entries into a new file called