E. coli Database Collection --> GCG format

Peter Rice pmr at staffa.sanger.ac.uk
Thu Nov 3 04:45:14 EST 1994

In article <398sqb$79b at emory.mathcs.emory.edu> bcresas at bimcore.emory.edu (Scott Sammons) writes:
>   Has anyone successfully reformatted the ECD data into GCG-formatted databases?
>   The program embltogcg core dumps when I try it with one of the ECD .dat
>   files.

As I am involved in both ECD and EGCG, I will try to put something into EGCG 8.0
when it is ready. I take it you are referring to the very latest (new format)
ECD here. Do you mean the genorf.dat file or the contigs/*.dna files (which are
closer to EMBL format and have more sequence data)?

Beware, though: E. coli sequencing is now going so well that ECD has a contig
over the 350k mark (with more to follow), which will cause some problems with GCG.

Another option would be to use a script (shell or Perl) to reformat the data into
enough of an "EMBL" format for EMBLtoGCG to accept it.
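As a rough illustration of the reformatting idea, here is a minimal sketch in
Python (rather than shell or Perl) of the output side only: it writes one
sequence as a bare-bones EMBL-style entry with ID, DE and SQ lines. The function
name format_embl_entry, the "standard; DNA; PRO" ID fields and the choice of
which line types to emit are my assumptions, not anything taken from ECD or
EGCG, and I have not checked how much of the format EMBLtoGCG actually needs.
Parsing the ECD files themselves is left out because their layout is not
described here.

```python
from collections import Counter


def format_embl_entry(entry_id: str, description: str, sequence: str) -> str:
    """Return a minimal EMBL-style flat-file entry for one nucleotide sequence.

    Hypothetical helper: only ID, DE, SQ and sequence lines are written.
    """
    # Drop whitespace and normalise case before counting bases.
    seq = "".join(sequence.split()).lower()
    counts = Counter(seq)
    other = len(seq) - sum(counts[base] for base in "acgt")

    lines = [
        # The ID line fields after the name are assumed, not taken from ECD.
        f"ID   {entry_id}  standard; DNA; PRO; {len(seq)} BP.",
        "XX",
        f"DE   {description}",
        "XX",
        f"SQ   Sequence {len(seq)} BP; {counts['a']} A; {counts['c']} C; "
        f"{counts['g']} G; {counts['t']} T; {other} other;",
    ]

    # Sequence lines: 60 bases per line in blocks of 10, with the position
    # of the last base on the line right-justified to column 80.
    for start in range(0, len(seq), 60):
        chunk = seq[start:start + 60]
        blocks = " ".join(chunk[i:i + 10] for i in range(0, len(chunk), 10))
        end = start + len(chunk)
        lines.append(f"     {blocks:<65} {end:>9}")

    lines.append("//")  # entry terminator
    return "\n".join(lines) + "\n"
```

A script would call this once per contig and concatenate the results into a
single .dat file, for example format_embl_entry("ECTEST", "test contig", seq).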

Peter Rice                           | Informatics Division
E-mail: pmr at sanger.ac.uk             | The Sanger Centre
Tel: (44) 1223 494967                | Hinxton Hall, Hinxton,
Fax: (44) 1223 494919                | Cambs, CB10 1RQ
URL: http://www.sanger.ac.uk/~pmr    | England
