Hi Iain,
> I am trying to return the embl entries for a list of uniprot entries.
> I use the following command.
> getz '(@testing > embl)'
> where the file testing contains:
> uniprot:CYGB_MOUSE
> uniprot:GLB1_SCAIN
>
> The output is:
> EMBL:AK019410
> EMBL:MMU315163
> EMBL:BC055040
>
> Is there any way of viewing the Uniprot IDs as well as the EMBL ID?
> My ideal output would be
> EMBL:AK019410 UNIPROT:CYGB_MOUSE
> EMBL:MMU315163 UNIPROT:CYGB_MOUSE
> EMBL:BC055040 UNIPROT:CYGB_MOUSE
>
> I have tried getz '(@testing > embl) > uniprot'
> but this only returns one entry, rather than three..
>
> I want to parse out the results into individual files according to the
> uniprot id.
>
> I believe it is possible using views and wgetz, but I would prefer not
> to use wgetz
A simple solution is to use a shell script to do the relevant
processing. For example:
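
#!/bin/sh
# For each id listed in the file "testing", fetch the linked EMBL
# entries and append the id to every line of output, tab-separated.
# printf is used for the tab because echo "\t" is not portable.
tab=`printf '\t'`
for ln in `cat testing`; do
    getz "[$ln]>embl" | sed "s#\$#$tab$ln#"
done
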
This produces your desired result, but is inefficient for large lists of
ids, since each id is processed using an individual getz call.
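
Since you want one file per uniprot id in the end, the tab-separated
output of a loop like the one above could be split with awk. This is
only a sketch: the function name split_by_uniprot, the output directory
argument and the file naming are my own choices for illustration and are
not part of getz or SRS.

```shell
#!/bin/sh
# split_by_uniprot OUTDIR   (hypothetical helper, not part of SRS)
# Reads "EMBL_ID<TAB>UNIPROT_ID" lines on stdin and appends each EMBL id
# to OUTDIR/<uniprot id>.txt. Characters that are awkward in file names,
# such as the ':' in "uniprot:CYGB_MOUSE", are replaced with '_'.
split_by_uniprot() {
    outdir=$1
    mkdir -p "$outdir" || return 1
    awk -F '\t' -v dir="$outdir" '
        NF >= 2 {
            name = $2
            gsub(/[^A-Za-z0-9_.-]/, "_", name)   # make the id safe as a file name
            file = dir "/" name ".txt"
            print $1 >> file
            close(file)                          # do not hold many files open
        }'
}
```

Piping the loop's output into split_by_uniprot results would then leave
files such as results/uniprot_CYGB_MOUSE.txt, each holding the EMBL ids
for that entry. Because the files are appended to, clear the directory
before a re-run.
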
If your set of ids is the product of a query you could use an Icarus
script to do the processing instead, and avoid some of the overhead
involved in the getz calls.
Hamish
--
============================================================
Mr Hamish McWilliam
European Bioinformatics Institute
Wellcome Trust Genome Campus
Hinxton, Cambridge, CB10 1SD, UK
URL: http://www.ebi.ac.uk/
============================================================