FASTA format - No funny characters!!

Roland Walker walker at ncbi.nlm.nih.gov
Fri Nov 6 11:55:17 EST 1998


[John.Ledwith at CADUS.COM writes]
> Thank you, François.  But let's say I want to set $/ (the input record
> separator) to ">".  (Baba) O'Reilly says that you can set it to a
> multi-character string, but "^>", "\n>", even "\s>", isn't cutting it (no
> pun intended).

#!/bin/perl
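$/="\n>";                  # records end at a newline followed by ">"
while (<>) {
  chomp;                   # strip the trailing "\n>" separator
  s/\A>?([^\n]*)\n?//;     # cut the defline off the record, keeping it in $1
  my $defline=">$1";
  tr/\n//d;                # join the remaining sequence lines
  my $sequence=$_;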

  $defline =~ s/\A([^\cA]{1,80}).*/$1/s;  # Example:
                                          # limit defline to max 80 columns
                                          # or up to the first control-A


  # other transformations left as
  # an exercise to the reader

  print "$defline\n$sequence\n";

}
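
As for why "^>" gets you nowhere: $/ is compared against the input as a
plain string, not as a regular expression, so the caret is just a caret.
Here is a minimal sketch of the difference. The sample data, the $fasta
variable and the read_records helper are made up for illustration, and
the input is read from an in-memory filehandle rather than a real file.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical two-record FASTA input, held in a string for the demo.
my $fasta = ">seq1 first example\nACGT\nTTGA\n>seq2 second example\nGGCC\n";

# Read the sample with the given record separator and return the records.
sub read_records {
  my ($separator) = @_;
  local $/ = $separator;     # matched literally, never as a regex
  open my $fh, '<', \$fasta or die "cannot open in-memory file: $!";
  my @records = <$fh>;
  close $fh;
  return @records;
}

# A literal caret followed by ">" never occurs in the input,
# so the whole file comes back as one record.
my @caret = read_records('^>');
print scalar(@caret), " record(s) with '^>'\n";

# A newline followed by ">" occurs once, between the two entries,
# so the input is split into two records.
my @newline = read_records("\n>");
print scalar(@newline), " record(s) with '\\n>'\n";
```

Note that with "\n>" the first record still carries its leading ">",
which is why the loop above strips an optional ">" from the front of
each record.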

-- 

Roland Walker
walker at ncbi.nlm.nih.gov
National Center for Biotechnology Information
