EMBLtoGCG

Rodrigo Lopez rodrigol at bioslave.uio.no
Wed Apr 17 01:51:37 EST 1996


Dear GCG'ers, 

I received a message from Lynn Miller of GCG suggesting a much better
and more robust fix for the problem of corrupted headers when using: 

% embltogcg -prot 

Lynn's proposed fix takes into account a side effect on StringSearch,
which I didn't check as I no longer use it (!). It also adds some further
instructions for building the fix, for those who have not yet built the
GCG libraries. Here is Lynn's fix, which I suggest everyone use rather
than mine: 

-----------------------------------------------------------------------

The solution recently posted will correct the problem with the "PROTEIN"
command line option for EMBLToGCG, but it will also cause the "SWISSPROT"
command line option to set the SN for SwissProt to "SwissProt" instead of
"SW".  As far as I can see, this will mainly affect the output from the
StringSearch application, which uses the value of SN in its list file
output.  The result will be less readable output when StringSearch is used
to search SwissProt. 


**********************************************************
 NOTE:  This should only be done if you are formatting
        an EMBL-format protein database with the PROTEIN
        command-line switch.  This switch is ONLY useful
        for formatting databases like TREMBL and incremental
        updates to SwissProt (SwissProt full releases should be
        formatted with the SWISSPROT command line option)
***********************************************************

An alternate source code modification is: 

Change from:

  if (IsSwiss || IsProtein)
    strcpy(header->type, "P"); 
  else {
    CapTokens(shortName); 
    strcpy(header->shortName, shortName); 
    strcpy(header->type, "N"); 
  }


Add lines __after__ the above lines so that it now reads:

  if (IsSwiss || IsProtein)
    strcpy(header->type, "P"); 
  else {
    CapTokens(shortName); 
    strcpy(header->shortName, shortName); 
    strcpy(header->type, "N"); 
  }
  if ( IsProtein )
    strcpy(header->shortName, shortName); 

This source code change will fix the problem with the -PROTEIN option
without affecting the behaviour of the -SWISSPROT option. 
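
For anyone who wants to check the logic before rebuilding, here is a small
standalone sketch of the patched branch. It is not the GCG source: the
DbHeader struct, the SetHeaderFields wrapper and CapTokensSketch (which
simply upper-cases its argument) are hypothetical stand-ins, and it assumes
the header's shortName has already been preset (to "SW" for SwissProt)
before this code is reached.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Hypothetical stand-in for the real GCG header record. */
typedef struct {
    char type[2];
    char shortName[32];
} DbHeader;

/* Hypothetical stand-in for CapTokens(): upper-cases the string in place.
   What the real routine does is an assumption here. */
void CapTokensSketch(char *s)
{
    for (; *s; ++s)
        *s = (char)toupper((unsigned char)*s);
}

/* Mirrors the patched if/else from the posting.  The caller is assumed to
   have preset header->shortName, and shortName must fit in the buffer. */
void SetHeaderFields(DbHeader *header, char *shortName,
                     int IsSwiss, int IsProtein)
{
    assert(header != NULL && shortName != NULL);

    if (IsSwiss || IsProtein)
        strcpy(header->type, "P");
    else {
        CapTokensSketch(shortName);
        strcpy(header->shortName, shortName);
        strcpy(header->type, "N");
    }
    /* The added lines: only -PROTEIN overrides the preset short name,
       so a -SWISSPROT run keeps whatever was set earlier. */
    if (IsProtein)
        strcpy(header->shortName, shortName);
}
```

In this sketch, calling SetHeaderFields with IsSwiss set leaves a preset
"SW" untouched, while calling it with IsProtein set copies the supplied
short name into the header. Both cases end up with type "P".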

The build instructions in the previous posting are basically correct, but
I suggest changing the command: 

% fetch makefile.mm 

to 

% fetch gendocdata:makefile*

The previously posted instructions also assume that the shared object
libraries and make files have been built.  If this is not true for your
system, you will also need to add the commands: 

% bldmkfiles 
% buildlib


**********************************************************
 NOTE:  Before building any libraries or applications, please
        be certain that you have the supported versions
        of the Fortran and C compilers. 
**********************************************************
-------------------------------------------------------------------------

Many thanks to Lynn for these corrections/additions!


Enjoy!



R:)




More information about the Info-gcg mailing list