
MSF problem?

L. H. Bell lhb at s-ind1.dl.ac.uk
Mon Aug 1 09:17:36 EST 1994


In article <gbga13-300794123016 at mac7-43.genetics.gla.ac.uk>, gbga13 at udcf.gla.ac.uk (B.L.Cohen) writes:
|> I'm having "Cannot Open" or "Cannot read" problems with MSF files produced
|> by the Export Foreign format function of GDE (i.e. Gilbert's Readseq) and
|> entered as input as follows: Plotsim filename.msf{*}
|> 
|> The files look like OK MSF DNA multiple alignment files to our eyes.   
|> 
|> Can anyone point to possible problems?   

This looks interesting. When you output proteins from GDE as MSF, they
load fine into GCG programs, but when you output DNA, you get this
error. However, running the corrupt MSF file through readseq again
(readseq -a -p -form=msf <dna.msf >dna2.msf) produces a working MSF
file that loads into GCG. The test file contained 3 DNA sequences, all
550 bases long; the first was named SYNR. A diff of the two files shows
that the failing file declares Len: 600 for SYNR and carries different
checksums:

s-crim1:lhb 140> diff dna.msf dna2.msf 
2c2
<  gde26558_1  MSF: 550  Type: N  January 01, 1776  12:00  Check: 8511 ..
---
>  dna2.msf  MSF: 550  Type: N  January 01, 1776  12:00  Check: 5077 ..
4c4
<  Name: SYNR             Len:   600  Check:  9901  Weight:  1.00
---
>  Name: SYNR             Len:   550  Check:  6467  Weight:  1.00
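
The diff suggests a quick sanity check: compare each sequence's declared
Len with the alignment length on the MSF: line. The following is only a
minimal Python sketch of that check. The function name
find_len_mismatches and the sample strings are hypothetical, the header
layout is taken solely from the two diff hunks above, and it assumes
(as in the working file) that Len should equal the MSF: length. Whether
this mismatch is what actually makes GCG reject the file is not
established here.

```python
import re

# Patterns for the two kinds of MSF header lines seen in the diff.
MSF_LINE = re.compile(r"MSF:\s*(\d+)")
NAME_LINE = re.compile(r"Name:\s*(\S+)\s+Len:\s*(\d+)")


def find_len_mismatches(msf_text):
    """Return (name, declared_len, alignment_len) for each Name: line
    whose Len differs from the length on the MSF: line."""
    alignment_len = None
    mismatches = []
    for line in msf_text.splitlines():
        if line.strip() == "//":
            break  # end of header; the sequence blocks follow
        msf_match = MSF_LINE.search(line)
        if msf_match and alignment_len is None:
            alignment_len = int(msf_match.group(1))
            continue
        name_match = NAME_LINE.search(line)
        if name_match and alignment_len is not None:
            declared = int(name_match.group(2))
            if declared != alignment_len:
                mismatches.append((name_match.group(1), declared, alignment_len))
    return mismatches


# Header fragments rebuilt from the diff output (illustrative only).
BAD_HEADER = """\
 gde26558_1  MSF: 550  Type: N  January 01, 1776  12:00  Check: 8511 ..

 Name: SYNR             Len:   600  Check:  9901  Weight:  1.00

//
"""

GOOD_HEADER = """\
 dna2.msf  MSF: 550  Type: N  January 01, 1776  12:00  Check: 5077 ..

 Name: SYNR             Len:   550  Check:  6467  Weight:  1.00

//
"""

print(find_len_mismatches(BAD_HEADER))   # [('SYNR', 600, 550)]
print(find_len_mismatches(GOOD_HEADER))  # []
```

Running a check like this on a file before handing it to GCG would flag
the first hunk's header and pass the second.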

My GDE fix would be to create a new file item, 'export as msf', with
the itemmethod below, and to remove the MSF option from 'export foreign
format':

itemmethod:readseq -pipe -all -form=msf < in1 > out1 ; readseq -p -a -form=msf <out1 > $OUTPUTFILE

Does anybody know why readseq behaves like this, and can anyone suggest
a more elegant fix? GDE pipes the sequences into readseq in GenBank format.

Hope this was helpful,

Lachlan Bell



More information about the Info-gcg mailing list
