estimation of base accuracy and putting N instead of - in sequence files

Andy Law Andy.Law at bbsrc.ac.uk
Tue Jul 29 03:16:14 EST 1997

In article <5ril0e$ieu at mserv1.dl.ac.uk>, rifat at icr.ac.uk wrote:

 >  Hi,
 >          Recently I had a problem with some sequences: they gave
 >  noisy sequence up to 150bp and then good sequence from 150bp to
 >  500bp. But the eba program clipped the whole sequence and excluded
 >  it from the database. How can I change the eba parameters so that
 >  it becomes more adaptive, i.e. it clips the noisy sequence at the
 >  beginning, keeps (includes) the good sequence, and then clips the
 >  noisy bit at the end, instead of clipping everything as bad
 >  sequence?

I altered the pregap script to check the QL/QR tags after quality checking
and before sequence vector clipping (the point at which "iffy" sequences
drop into the .failed file), then offer an interactive clipping step at
that stage as well as at the "normal" stage. It works well and is pretty
easy to implement.
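
As a rough sketch of the kind of check involved (not the actual pregap
change), the snippet below reads the QL and QR records from an experiment
file and prints how many bases lie between the two quality clip points.
It assumes each record is a line starting with the two-letter type
followed by whitespace and a number, and that QL is the last clipped base
on the left and QR the first clipped base on the right. The function name
and the example file name are made up for illustration.

```shell
# good_length FILE
# Print the number of bases between the QL and QR quality clip points.
# Assumption: records look like "QL   150" and "QR   501".
good_length() {
    awk '
        $1 == "QL" { ql = $2 }
        $1 == "QR" { qr = $2 }
        END {
            # Fail if either tag is missing from the file.
            if (ql == "" || qr == "") exit 1
            print qr - ql - 1
        }
    ' "$1"
}

# Example (hypothetical file name):
#   good_length reading.exp
```

A script could compare that number against a minimum length and offer the
interactive clipping step only for readings that fall short.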

 >  Also, why does Staden change the Ns in the sequence to - ? Is there
 >  any way I can leave the Ns in the sequence? When I BLAST the
 >  sequences, the BLAST software gives a warning about the -

That sort of thing can also be easily done in a script file. Something
along the lines of

    sed "s/-/N/g" < file.seq
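
If the file is in FASTA format for BLAST, that one-liner would also
rewrite any dashes in the header lines. A slightly more careful sketch,
assuming header lines start with ">" and using a made-up function name
and made-up file names, leaves those lines alone:

```shell
# dash_to_n [FILE...]
# Replace every "-" with "N" on sequence lines, skipping FASTA headers.
dash_to_n() {
    sed '/^>/!s/-/N/g' "$@"
}

# Example (hypothetical file names):
#   dash_to_n file.seq > file.blast.seq
```

Writing to a separate file keeps the original untouched, so the "-"
version is still available for the Staden programs.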

Andy Law
( Andy.Law at bbsrc.ac.uk )
( Big Nose in Edinburgh )
