In article <keith-280394111320 at mac09.biochem.ualberta.ca>,
keith at bones.biochem.ualberta.ca (Keith Robinson) wrote:
> In article <9403251611.AA23221 at gcrc.scripps.edu>, yagi2 at SCRIPPS.EDU (Akemi
> Yagi/BCR7 4-8094) wrote:
>
> > Hello GCG users,
> >
> > When running a search program such as FINDPATTERNS, is there any way
> > to restrict the search only within cDNA or mRNA excluding genomic DNA etc?
>
> There is no way that I've found to specify a cDNA search in GCG. This
> method works for me, though it's not 100% foolproof:
>
> * Do a 'stringsearch -noheading -noscreen -nomonitor -outfile=cdna.txt'
> of the database for all occurrences of the text string "cDNA". This will
> save a list of all the sequence names into a file named "cdna.txt".
>
> Keith
> --
> Keith Robinson Dept. of Biochemistry
> The University of Alberta Edmonton, Alberta Canada
Keith and Akemi:
I agree that STRINGSEARCH is the way to go for this. I recently did a
similar search. After a couple of "practice" searches, I found that no
single keyword (or set of words) always identified cDNAs. "cDNA" was
actually quite rare. The preferred term seems to be "cds" (whatever that
might mean). I ended up doing a STRINGSEARCH on <cds, mRNA, exon, " gene",
onco>. Notice that " gene" is in quotes and is preceded by a space -- this
prevents matches on "pseudogene," which is common. The <onco> is there to
catch the several entries that were identified only as "oncogene."
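
To show why the leading space in " gene" matters, here is a minimal Python
sketch of the same keyword logic applied to definition lines. It is not
GCG's STRINGSEARCH and says nothing about how that program matches text
internally; the function names, the sample definition lines, and the choice
of case-insensitive matching are all assumptions made for illustration.

```python
# Keywords taken from the search described above. The leading space in
# " gene" is deliberate: it keeps "pseudogene" from matching.
# Lower-casing everything is an assumption of this sketch.
KEYWORDS = ["cds", "mrna", "exon", " gene", "onco"]


def looks_transcribed(definition):
    """Return True if the definition line contains any of the keywords."""
    text = definition.lower()
    return any(keyword in text for keyword in KEYWORDS)


def filter_entries(entries):
    """Keep the names of entries whose definition line matches a keyword.

    `entries` is an iterable of (name, definition) pairs; both the pairs
    and the names used below are hypothetical examples.
    """
    return [name for name, definition in entries if looks_transcribed(definition)]


if __name__ == "__main__":
    sample = [
        ("ENTRY1", "Mouse actin gene, complete cds"),
        ("ENTRY2", "Human pseudogene for actin"),
        ("ENTRY3", "Human c-myc oncogene"),
        ("ENTRY4", "H.sapiens mRNA for insulin"),
    ]
    for name in filter_entries(sample):
        print(name)
```

A pseudogene entry that also mentions an exon or a cds would still get
through, so the resulting list is worth checking by eye.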
Good luck,
Clint