Colleagues,
After installing GCG V8.1 and re-indexing my databases, I noticed strange
behaviour of the software when reading the (full) EMBL database: programs
that read the whole database (such as FASTA, DBSTATS, EQUICKINDEX, ...)
stopped reading with the following error message:
*** ERROR in SQNext. Sequence reading is out of synch.
This appeared to happen only in those database sections that contained split
sequences, i.e. EM_BA, EM_FUN, and EM_PR. The sequences were split in the
old GCG way: when longer than 350k bases, they were split into 350k chunks.
However, when I reformatted these sections according to the new GCG scheme
(i.e. when longer than 350k, split into 100k chunks, plus a 10k overlap),
and reindexed them, it all worked perfectly again.
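To make the new scheme concrete, here is a minimal Python sketch of how the
chunk boundaries could be computed. It is an illustration only, not the code
used by EMBLTOGCG or embl2nbrf. The function name split_spans is hypothetical,
and the sketch assumes that each chunk starts 100k bases after the previous
one and carries 10k extra bases shared with the next chunk; the exact
boundaries GCG uses may differ.

```python
def split_spans(length, threshold=350_000, chunk=100_000, overlap=10_000):
    """Return half-open (start, end) spans covering a sequence of `length` bases.

    Assumption (not confirmed by GCG documentation): sequences longer than
    `threshold` are cut into pieces that start every `chunk` bases, each
    extended by `overlap` bases into the following piece.
    """
    if length <= threshold:
        # Short enough: keep the sequence as a single entry.
        return [(0, length)]

    spans = []
    for start in range(0, length, chunk):
        end = min(start + chunk + overlap, length)
        spans.append((start, end))
        if end == length:
            # The sequence end is reached; a further piece would lie
            # entirely inside this one.
            break
    return spans


if __name__ == "__main__":
    for start, end in split_spans(360_000):
        print(start, end)
```

Under these assumptions a 360k sequence yields four pieces, the last one
shorter than the others, while a sequence of 350k or less stays whole.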
So, if you do your own database conversions, reformat EMBL from flatfile
before indexing under V8.1! EMBLTOGCG will do the trick. If, however, you
insist on having your databases in NBRF format (like I do), then pick up
the latest version of "embl2nbrf" from my FTP site:
ftp://ftp.caos.kun.nl/pub/molbio/embl2nbrf
Best regards,
Jack
Jack A.M. Leunissen, Ph.D. | CAOS/CAMM Center, Univ. of Nijmegen
Email: jackl at caos.kun.nl | Toernooiveld
Tel. : +31 80 65 22 48 | 6525 ED Nijmegen, The Netherlands
Fax : +31 80 65 29 77 | URL=http://www.caos.kun.nl/