Sorry for posting this, mail to the sender bounced:
The sequences are not recognized because the file is not quite valid
MSF as far as GCG is concerned. Two problems apply to this file:
1) GCG uses "." for gaps, but this file has "-". This can
be fixed with a text editor search/replace.
2) There is a checksum problem. This can be fixed by running the
file through
"reformat -msf gde842_9.msf{*}"
Note that if you run reformat (at least through V8.0) without first
changing the gap character, it will remove all your gaps (probably
not what you want).
A third problem, in general, is that ReadSeq saves all MSF files as
nucleotide, so protein files have to have their type changed manually
(not a problem here).
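
For anyone who would rather script the gap fix than do it by hand, here
is a minimal sketch in Python. The function name fix_msf_gaps is made up
for illustration, and the sketch assumes the alignment body starts after
a line containing only "//", as in the file quoted below. It changes "-"
to "." only in the sequence columns, so hyphens in the header or in
sequence names are left alone. It does not touch the checksums, so the
result would still need to go through reformat as described above.

```python
import re


def fix_msf_gaps(text):
    """Replace '-' gaps with '.' in the alignment body of MSF text.

    Assumption: everything after the '//' line is alignment data, and
    the first whitespace-delimited token on each data line is the
    sequence name.
    """
    fixed = []
    in_body = False
    for line in text.splitlines():
        if in_body:
            # Group 1 is the indent plus the sequence name, group 2 is
            # the residues; only the residues get their gaps changed.
            m = re.match(r"(\s*\S+)(.*)", line)
            if m:
                line = m.group(1) + m.group(2).replace("-", ".")
        elif line.strip() == "//":
            # Header ends here; following lines are sequence data.
            in_body = True
        fixed.append(line)
    return "\n".join(fixed) + "\n"


if __name__ == "__main__":
    import sys

    # Usage (hypothetical script name): python fix_msf_gaps.py < test.msf > fixed.msf
    sys.stdout.write(fix_msf_gaps(sys.stdin.read()))
```

Restricting the replacement to the lines after "//" is a deliberate
choice: a blanket search/replace would also alter any "-" that happens
to appear in the header.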
>Does anyone know why MSF files formatted by readseq are not recognized by GCG
>Version 8.1? (I don't know whether they were recognized by earlier versions).
>
>The file test.msf as written by readseq (through gde) is as follows:
>
> gde842_9 MSF: 100 Type: N January 01, 1776 12:00 Check: 9397 ..
>
> Name: test1 Len: 100 Check: 1581 Weight: 1.00
> Name: test2 Len: 100 Check: 6389 Weight: 1.00
> Name: test3 Len: 100 Check: 1427 Weight: 1.00
>
>//
>
> test1 AAACGATGCA CATATGTATT GTGCTCTAGA TACAGCATCA ---AGCTCTA
> test2 AAATGATGCA CACATGTACT GTGCTTTAGA TACAGCACAA CAGAGTGCTA
> test3 AAAAAGTGGT GCGGAATCTC TGGCAGCTAT TACCCGCGAC GCTAACATTA
>
> test1 CTGCAGGAGC AACT------ ACATCTGTTA TGGTAAAAAA TGAAAATTTA
> test2 CTAATGGTGC AACATTAGCT TCATCTGTTA TGATAAAAAA TGAAAATTTA
> test3 CTGAG----- -------ACC AATTACTTCG TAGTCAAAAT TGAGAAATTA
>
>
>If I try to use "distances" on this file I get:
>
> *** ERROR, bad sequence format in test.msf ! ***
> *** No files in test.msf ! ***
>
>Thanks in advance to anyone who can help.
>
>--
>---
>Basil Allsopp | E-mail basil at ovisun.ovi.ac.za
>Onderstepoort Veterinary Institute | Phone +27 12 5299385
>Onderstepoort 0110, South Africa | Fax +27 12 5299431
-------------=-=-=-=-=-=-=-------------=-=-=-=-=-=-=-------------
Michael Lonetto ** 415-476-1493 ** lonetto at cgl.ucsf.edu
UCSF Depts. of Stomatology and Micro.,San Francisco,CA 94143-0512
=-=-=-=-=-=-=----- http://terminator.ucsf.edu/ -----=-=-=-=-=-=-=