In article <3211DDB7.5D62 at bbsrc.ac.uk> Aengus Stewart <aengus.stewart at bbsrc.ac.uk> writes:
> Recently I have split the EST sections of from the main
> part of GENBANK and EMBL
>
> EMBL - everything except ESTs
> EMBLPLUS - everything and ESTs
>
> same for GENBANK
>
> If I now try to FETCH either using just a sequence name or
> accession number the sequence cant be found.
>
> If the subsection is included then it will find it.
>
> eg hs00001 is not found but est:hs00001 is!
>
> I have a sneaking suspicion that this is sitting right in front
> of me and I cant see it.
There is a cure for this kind of blindness. Be careful, though, as it can
lead to a side effect of double vision :-)
When you run FETCH without a database specification, it looks in
"GenData:", which translates to "@GenRunData:gendata.fil".
This should contain all the EMBL divisions, under whatever names
you use for them. It appears that the EST ones may be missing,
or may no longer be correctly defined.
The gendata.fil file has to name the actual database files. When I
checked and found the same problem, I discovered that I had "em_est",
which I had to expand to "em_est1" ... "em_est6". A sign that it has not
been used since em_est2 appeared :-)
For the current release of EMBL there is no entry hs00001,
but the effect is shown by, for example, entry hs00000a1.
Double vision? Make sure you don't put anything in the file twice;
otherwise FETCH will find entries twice when using this method
(but only once when you provide the database name).
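If you would rather not rely on eyesight for either problem, a small
script can do the looking. What follows is only a rough sketch, not
anything supplied with GCG. It assumes the list file holds one database
name per line as the first word, that names are case-insensitive, and
that lines starting with "!" are comments; I have not checked those
assumptions against the real gendata.fil layout, so adjust the parsing
to match your own file. The function names are made up for the example.

```python
from collections import Counter


def entry_names(lines):
    """Collect the first word of each non-blank, non-comment line.

    Assumption: one database name per line, "!" starts a comment,
    and names compare case-insensitively.
    """
    names = []
    for line in lines:
        text = line.strip()
        if not text or text.startswith("!"):
            continue
        names.append(text.split()[0].lower())
    return names


def find_duplicates(lines):
    """Names listed more than once (the 'double vision' case)."""
    names = entry_names(lines)
    counts = Counter(names)
    # dict.fromkeys keeps first-seen order while dropping repeats
    return [name for name in dict.fromkeys(names) if counts[name] > 1]


def find_missing(lines, expected):
    """Expected names that are absent (the 'blindness' case)."""
    present = set(entry_names(lines))
    return [name for name in expected if name.lower() not in present]


if __name__ == "__main__":
    # Made-up sample contents, not a real gendata.fil
    sample = ["! EMBL divisions", "em_est1", "em_est2", "EM_EST1", "em_pri"]
    wanted = ["em_est%d" % i for i in range(1, 7)]
    print("listed twice:", find_duplicates(sample))
    print("not listed:", find_missing(sample, wanted))
```

To use it on a real file, read the lines with open(path).readlines()
and pass in the division names you expect for the current release.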
--
------------------------------------------------------------------------
Peter Rice | Informatics Division,
E-mail: pmr at sanger.ac.uk | The Sanger Centre,
Tel: (44) 1223 494967 | Wellcome Trust Genome Campus,
Fax: (44) 1223 494919 | Hinxton, Cambridge, CB10 1SA,
URL: http://www.sanger.ac.uk/~pmr/ | England