IUBio

fromembl in gcg

Tim Bolling bollingt at ugene1.abbott.com
Mon Mar 28 12:39:40 EST 1994


> jenkins at aidsun.nibsc.ac.uk wrote:
> 
> : I have copied from the states the Los-Alamos hiv dbase (embl format).
> 
> : Using fromembl (unix flavour, SGI machine) I broke the flat file into
> : individual protein sequences.
> 
> : The output files contained "$" in the name, e.g. env$u445.embl.  My flavour
> : of unix will not recognise this dollar symbol, so I can't do anything
> : with these sequences.
> 
> I suppose you are having trouble with the $ because it is a special
> character in almost all unix shells, where it introduces a shell variable.
> 
> There are two ways to circumvent this.
> 1) You can make the shell treat the $ sign literally by putting a backslash
>    in front of it, e.g. env\$u445.embl instead of env$u445.embl.
> 2) You can quote the name of the file in single quotes, e.g.
>    'env$u445.embl' instead of env$u445.embl.
> 
> Hope that helps,
> 
> --Cornelius.
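
The two quoting forms suggested above can be tried safely in a scratch
directory.  This is only a minimal sketch, assuming a Bourne-style shell
(sh, bash, ksh); the scratch directory and the one-line file contents are
made up for illustration and are not real fromembl output.

```shell
# Scratch directory so that no real data is touched (illustrative only).
tmpdir=$(mktemp -d)

# Create a file whose name contains a literal dollar sign.
printf 'ID   TEST\n' > "$tmpdir"/'env$u445.embl'

# Unquoted, the shell would expand $u445 as a (normally unset) variable
# and look for env.embl instead.  Both forms below keep the $ literal.
escaped=$(cat "$tmpdir"/env\$u445.embl)    # backslash in front of the $
quoted=$(cat "$tmpdir"/'env$u445.embl')    # name in single quotes

echo "$escaped"
echo "$quoted"
```

Double quotes do not help here, because the shell still expands variables
inside them.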

Alternatively, you might want to replace all the $s in the original flat file
with an underscore (_) and then run the fromembl program to convert.  Warren
Gish (NCBI) gave me this tip when I experienced a similar kind of problem with
a GenBank flatfile (I had to remove carriage returns).  Here are the steps:

1.  sed -e 's/\$/_/g' hiv.flat > hiv.sub
(This line replaces every $ with _ and writes the result to the file hiv.sub.
The trailing g is needed so that every $ on a line is replaced, not just the
first.  Make sure that you type the substitution parameter EXACTLY as above.)

2.  fromembl hiv.sub (etc)
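
The substitution step can be checked on a small test file before running it
on the real database.  The following is a sketch only: the two-line input is
a made-up stand-in for hiv.flat, and the entry names in it are hypothetical.
Note the trailing g, which makes sed replace every $ on a line rather than
only the first one.

```shell
# Scratch directory and a hypothetical stand-in for the real flat file.
tmpdir=$(mktemp -d)
printf 'ID   env$u445 and gag$u445\nID   pol$k03455\n' > "$tmpdir/hiv.flat"

# \$ matches a literal dollar sign; g replaces every occurrence on a line.
sed -e 's/\$/_/g' "$tmpdir/hiv.flat" > "$tmpdir/hiv.sub"

# Count the lines that still contain a $; 0 means none were missed.
# (grep exits non-zero when nothing matches, hence the "|| true".)
remaining=$(grep -c '\$' "$tmpdir/hiv.sub" || true)
echo "lines still containing a dollar sign: $remaining"
```

If the count is 0, the converted file can be passed to fromembl as in step 2.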

Tim Bolling
bollingt at ugene1.abbott.com



More information about the Info-gcg mailing list
