joew at base4.com (Joe Wang) writes:
> According to the SRS menu, if I want to get all human
> CDSs from GenBank, I should do:
> getz '[genbank-org:homo] \> [genbank-ftk:cds]' -f seq
> Instead of the CDSs, I get the entire sequence entries.
> When I do:
> getz '[genbank-ftk:cds] \< [genbank-org:homo] ' -f seq
> I get all the CDSs from GenBank (not only the human ones).
> If I don't use the escape "\", I get nothing.
> It looks like the linking function ">" or "<" is not working.
> Any suggestions?
The "\>" is wrong - you should just use ">" or "<".
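To see what the backslash does to the query string, here is a small Python sketch. It uses shlex.split to mimic how a POSIX-style shell splits the command line (an assumption: your shell treats single quotes the POSIX way). It only shows which string reaches getz, not how getz then interprets it.

```python
import shlex

# The two command lines as they would be typed at a shell prompt.
escaped = r"getz '[genbank-org:homo] \> [genbank-ftk:cds]' -f seq"
plain = r"getz '[genbank-org:homo] > [genbank-ftk:cds]' -f seq"

# shlex.split follows POSIX quoting rules: inside single quotes a
# backslash is an ordinary character and is passed through unchanged.
escaped_args = shlex.split(escaped)
plain_args = shlex.split(plain)

# Element 1 is the query string handed to getz as its first argument.
print(escaped_args[1])
print(plain_args[1])
```

Inside single quotes the shell already leaves ">" and "<" alone, so the backslash is not needed for protection and ends up as part of the query itself.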
I suspect your GenBank release does not have "Homo" explicitly in the taxonomy.
Maybe the human entries are indexed only as "homo sapiens". If so, the
[genbank-org:homo] part of the query will not work.
You can quickly check this with:
(srs5) getz -lv '[genbank-org:homo*]'
(srs4) getz -rep '[genbank-org:homo*]'
Yikes - just tried that here with EMBL 51, and found a really weird one:
% getz -lv '[embl-org:homo*]'
////////////////////////////////////////////////////////////////////////////
Values in "EMBL"
homo 829492
homo sapiens 829491
////////////////////////////////////////////////////////////////////////////
report of query "[embl-org:homo*]"
.. and the reason is, two sets of OS and OC lines:
% getz "[embl-org:homo\!homo sapiens]" -e|more
ID H17BHYD standard; DNA; HUM; 21764 BP.
XX
AC M84472;
XX
NI g806392
XX
DT 25-DEC-1992 (Rel. 34, Created)
DT 13-OCT-1996 (Rel. 49, Last updated, Version 11)
XX
DE Human 17-beta-hydroxysteroid dehydrogenase (EDH17B1 and EDH17B2)
DE genes, complete coding regions and flanks.
XX
KW 17-beta-dehydrogenase; 17-beta-hydroxysteroid dehydrogenase;
KW Alu repeat; estradiol.
XX
OS Homo
OC Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
OC Vertebrata; Eutheria; Primates; Catarrhini; Hominidae.
XX
OS Homo sapiens (human)
OC Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
OC Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
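If you want to look for other entries like this one in a flat file, a rough Python sketch follows. It is not part of SRS: the function name os_lines_per_entry is made up for illustration, the sample is an abridged copy of the entry above, and the closing "//" terminator was added by hand so that the sample is a complete record.

```python
def os_lines_per_entry(flatfile_text):
    """Map each entry ID to the list of its OS (organism species) lines."""
    result = {}
    current_id = None
    for line in flatfile_text.splitlines():
        # The line code is everything before the first space.
        code, _, rest = line.partition(" ")
        if code == "ID":
            current_id = rest.split()[0]
            result[current_id] = []
        elif code == "OS" and current_id is not None:
            result[current_id].append(rest.strip())
        elif code == "//":
            # End of entry (terminator line).
            current_id = None
    return result


# Abridged copy of the entry shown above; the "//" line is an addition.
sample = """\
ID H17BHYD standard; DNA; HUM; 21764 BP.
XX
AC M84472;
XX
OS Homo
OC Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
OC Vertebrata; Eutheria; Primates; Catarrhini; Hominidae.
XX
OS Homo sapiens (human)
OC Eukaryotae; mitochondrial eukaryotes; Metazoa; Chordata;
OC Vertebrata; Eutheria; Primates; Catarrhini; Hominidae; Homo.
XX
//
"""

# Keep only the entries that carry more than one OS line.
multi = {eid: names
         for eid, names in os_lines_per_entry(sample).items()
         if len(names) > 1}
print(multi)
```

For the sample this reports H17BHYD with both "Homo" and "Homo sapiens (human)", which matches the two index values seen in the getz -lv output.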