
file splitting....help

Bernard P. Murray, PhD bpmurray*STUFFER* at socrates.ucsf.edu
Wed Feb 17 17:03:00 EST 1999


In article <7aejmp$rfq$1 at mserv2.dl.ac.uk>, "jayakumar"
<jakku at mrna.tn.nic.in> wrote:

> HI
>     I need to know where I can get some utility for splitting a large text
> file into easily manageable files.  But the splitting should occur when ever
> a particular word or parameter occurs in the target file. Actually, I want
> to split a large file containing searched abstracts from CCOD database, into
> individual abstract files. So the utility must split the file whenever it
> encounters a word like "title" or a large space etc.  Can you help me in
> this matter.   I would be very grateful for this help.
>     thanks in advance
> sincerely
> jayakumar

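I believe the Unix command csplit(1) is intended for this
purpose.  I am not familiar with it, but something like
     csplit -k -n 3 sourcetext '/^title/' '{100}'
will split the file at each line beginning with "title".
The {100} repeats the pattern 100 more times, so the pieces
land in files xx000, xx001, ... up to xx101, where xx000
holds whatever precedes the first title line and the last
file holds the rest of the input.  With -k, csplit keeps the
files it has already written if it runs out of matches;
without it they are removed on error.  I haven't tested
this, so check the man page.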
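
To see how csplit numbers its output, here is a small sketch
on made-up data.  The file contents and the scratch directory
are only assumptions for illustration; your real abstracts
will look different.  It uses just the -s, -k and -n options
and the '{n}' repeat count.  GNU csplit also accepts '{*}' to
repeat the pattern as many times as possible, which saves
guessing the count.

```shell
# Work in a scratch directory so the xx* files do not clutter anything.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: one leading line, then two "title" blocks.
cat > sourcetext <<'EOF'
header
title A
a
title B
b
EOF

# -s: do not print byte counts, -k: keep files on error,
# -n 3: three-digit suffixes.  '{1}' applies the pattern one extra time,
# so the file is split at both title lines.
csplit -s -k -n 3 sourcetext '/^title/' '{1}'

# xx000 holds the text before the first title line;
# xx001 and xx002 each start with a title line.
ls xx*
```

If the file starts directly with a title line, the first
output file may simply be empty and can be deleted.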

Alternatively, does your machine have AWK on it (or can you
install a copy)?  It should be reasonably easy to do this
with a simple AWK script, and AWK (or GAWK) is available for
most platforms.
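
A minimal sketch of such an AWK script, run from the shell,
might look like the following.  It assumes each abstract
starts with a line beginning with "title"; the sample data
and the output names abstract_001.txt, abstract_002.txt, ...
are made up for illustration, so adjust the pattern and names
to match your file.

```shell
# Build a tiny sample file (hypothetical data) in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
cat > sourcetext <<'EOF'
title First abstract
body of the first abstract
title Second abstract
body of the second abstract
title Third abstract
body of the third abstract
EOF

# Start a new output file at every line that begins with "title".
awk '
/^title/ {
    if (out != "") close(out)      # close the previous file so we do not
                                   # run out of open file handles
    out = sprintf("abstract_%03d.txt", ++n)
}
out != "" { print > out }          # lines before the first title are skipped
' sourcetext

ls abstract_*.txt
```

Unlike the csplit approach, this needs no repeat count, and
anything before the first title line is dropped rather than
written to a file of its own.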

(You will almost certainly hear Perl suggested, so I will
leave that to the Perl experts; I'm very much a Perl newbie.)

     Good luck,
          Bernard
-- 
Bernard P. Murray, PhD
Dept. Cell. Mol. Pharmacol., UCSF, San Francisco, USA




More information about the Bio-soft mailing list
