
$Seq.trunc and $Seq.app in 5.1

M.P. Hilbers mph at dl.ac.uk
Mon Feb 15 10:30:42 EST 1999


"K. Cuelenaere" wrote:
> 
> Hi there,
> 
>  perhaps these are problems already solved; if so, I'm afraid
> I've missed it in this newsgroup.
> 
> It's about the split files:
> 
> the trunc c function wasn't ok: I changed the commented line (by me)
> and now it works fine (this must have been already solved at
> the other srs servers because they seem to "trunc" ok). However
> the ftp-able distribution on ftp.ebi still contains the old SeqIopTrunc
> function.
> 
> SEQv SeqIopTrunc (Int4 runTime, SEQv seq, Int4 l)
> {
>   if (runTime)
>     IargGetArgs ("seq|len", &seq, &l);
> /*  SeqSubSeq (seq, 1, SeqLen (seq)); oops, always returns the entire sequence*/
>   SeqSubSeq (seq, 1,  SeqLen (seq) - l);
>   IcaReturn ("Seq", seq);
>   return seq;
> }
> 
> And about the use of $s.app in the embl.is (and others): I don't get it.
> One should first do all the $s.app and finally a $s.make.
> But in embl.is the $s.make is done in the seq production, which
> can be called several times for one sequence when dealing with
> split sequences.... So the result is (with our embl) that only the last
> sequence part is returned. The same is achieved in an icarus test file:
> 
> $s.app:"abc"
> $s.make
> $s.app:"def"
> $s.make
> $Print:$s.str
> 
> results in: "def"
> 
>  So, ok, one should use the make just after the
> last app, but how come it works right with the other srs servers, while
> I can't notice a difference with our embl.is file ... was there
> also a change in the SeqIopApp c function? Most other srs servers
> are dealing well with split entries, but e.g. BEN doesn't.
> 
> best regards,
> Koen.
> --

There are probably a lot of sites that use the native format and thus
don't have this problem.

There seem to be several problems here. I found that $s.trunc crashes
unless a $s.make has been done before it, so you need a $s.make in the "seq"
production. But $s.app appears unable to append to a sequence
once it has been "made"; it just puts the new text at position 0.
So the choice is apparently either to present only the last part of the
sequence, or to corrupt the sequence by not using trunc.

As a workaround I did the following:

I "activated" the seqcat function in seq.c (created SeqIopCat):

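/* Append the residues of `from` to the end of `to`. */
void SeqCat (SEQv to, SEQv from)
{
  StrApp (&to->seq, from->seq);
}

/* Icarus wrapper for SeqCat, modelled on the other SeqIop functions. */
SEQv SeqIopCat (Int4 runTime, SEQv to, SEQv from)
{
  /* When called from Icarus, fetch the two sequence arguments. */
  if (runTime)
    IargGetArgs ("to|from", &to, &from);

  SeqCat (to, from);
  IcaReturn ("Seq", to);
  return to;
}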


Then in bi_seq.i:

In the Seq methods I added:

    $Com:[cat f:SeqIopCat return:object returnclass:@Seq
      rem:"Concatenate two sequences."
      args:{
    	$Var:[to this:y t:object class:@Seq]
    	$Var:[from unnamed:y t:object class:@Seq]
      }
    ]

In the Sequence functions I added:

     $Com:[SeqCat f:SeqIopCat return:object returnclass:@Seq
      rem:"Concatenate two sequences."
      args:{
    	$Var:[to this:y t:object class:@Seq]
    	$Var:[from  unnamed:y t:object class:@Seq]
      }
    ]
   ]


Then recompile.

Now in embl.is:


  gcgseq:  ~ { $In:[file:seq] $Out
                 if:$entryName && $en!=$entryName {
                   $Print:|Entry name mismatch:  ref=($entryName), seq=$en
                   $Exit
                 }                   
               $Wrt:[s:" "]}
               '>>>>' {$seqFip=$Fip} 
               (/([A-Z0-9]+)_0/ {$en=$1 $s=$Seq:$Ct} seq
               (/>>>>[A-Z0-9]+_0?[1-9]+|>>>>[A-Z0-9]+_[1-9][0-9]+/ 
               {$s.trunc:10000} aseq )+  
               |/[A-Z0-9]+/ { $en=$Ct $s=$Seq:$Ct} seq ) ~ 


  seq:       ~ /.*2BIT *Len: */ /[0-9]+/ {$len=$Ct} ln ln 
	       {$s.get2Bit:[$JobFile len:$len]} 
               ('>>>>'{$Not} ln)* |
               /.*ASCII/ ln ln ('>>>>' {$Not} /.*/ {$s.app:$Ct})+       
                x {$s.make} 
              ~

  aseq:       ~ {pre $as=$Seq:"temp"}
               (/.*2BIT *Len: */ /[0-9]+/ {$len=$Ct} ln ln 
	       {$as.get2Bit:[$JobFile len:$len]} 
               ('>>>>'{$Not} ln)* |
               /.*ASCII/ ln ln ('>>>>' {$Not} 
               /.*/ {$as.app:$Ct})+)       
               x {$as.make $s.cat:$as} 
              ~


So the first (or only) section is handled by seq (where $s.make is also done).
For subsequent sections, $s is first truncated by 10000, and then the production
aseq reads the new section into $as, which is "made" and then added to $s with
$s.cat.
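
To make the truncate-then-concatenate flow concrete, here is a minimal sketch
in plain C. It does not use the SRS types or functions: toy_trunc, toy_cat and
toy_join_sections are hypothetical names standing in for $s.trunc, $s.cat and
the gcgseq loop, and sequences are ordinary C strings. It also assumes that
each later section repeats the tail of the section before it, which is
presumably why the truncation is needed; the overlap is a parameter here
rather than the fixed 10000.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for $s.trunc (hypothetical, not SeqIopTrunc):
   drop the last `overlap` characters of `seq` in place. */
void toy_trunc(char *seq, size_t overlap)
{
    size_t len = strlen(seq);
    assert(overlap <= len);          /* cannot remove more than we have */
    seq[len - overlap] = '\0';
}

/* Toy stand-in for $s.cat (hypothetical, not SeqIopCat):
   return a newly allocated string holding `to` followed by `from`.
   The caller frees the result. Returns NULL if allocation fails. */
char *toy_cat(const char *to, const char *from)
{
    size_t n = strlen(to);
    size_t m = strlen(from);
    char *out = malloc(n + m + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, to, n);
    memcpy(out + n, from, m + 1);    /* m + 1 copies the terminating NUL */
    return out;
}

/* Join split sections the way the gcgseq production above does:
   keep the first section as is, then for every later section
   truncate the accumulated sequence by `overlap` and append the
   new section. The caller frees the result. */
char *toy_join_sections(const char *const *sections, size_t count,
                        size_t overlap)
{
    char *joined;
    size_t i;

    if (count == 0)
        return NULL;
    joined = toy_cat("", sections[0]);   /* private copy of section 0 */
    if (joined == NULL)
        return NULL;

    for (i = 1; i < count; i++) {
        char *next;
        toy_trunc(joined, overlap);      /* like {$s.trunc:10000} */
        next = toy_cat(joined, sections[i]);  /* like {$s.cat:$as} */
        free(joined);
        if (next == NULL)
            return NULL;
        joined = next;
    }
    return joined;
}
```

For example, joining the three sections "ABCDEFGH", "GHIJKL" and "KLMN" with
an overlap of 2 gives "ABCDEFGHIJKLMN": each repeated tail is removed before
the next section is appended, so nothing is lost or duplicated.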

Cheers,

Martin



-- 
-------------------------------------------------------------------
| Martin Hilbers http://www.dci.clrc.ac.uk/Person.asp?m.p.hilbers |
| SEQNET                |     E-mail: m.p.hilbers at dl.ac.uk        |
| Daresbury Laboratory  |     Tel:    +44-1925-603492             |
| Daresbury, Warrington |     Fax:    +44-1925-603100             |
| Cheshire WA4 4AD      | SEQNET is the UK national EMBNet node   |
| United Kingdom        |     http://www.seqnet.dl.ac.uk/         |
-------------------------------------------------------------------



