In article <7aejmp$rfq$1 at mserv2.dl.ac.uk>, "jayakumar"
<jakku at mrna.tn.nic.in> wrote:
> HI
> I need to know where I can get some utility for splitting a large text
> file into easily manageable files. But the splitting should occur whenever
> a particular word or parameter occurs in the target file. Actually, I want
> to split a large file containing searched abstracts from CCOD database, into
> individual abstract files. So the utility must split the file whenever it
> encounters a word like "title" or a large space etc. Can you help me in
> this matter? I would be very grateful for this help.
> thanks in advance
> sincerely
> jayakumar
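I believe the Unix command csplit(1) is intended for this
purpose. I am not familiar with it, but something like

    csplit -n3 sourcetext '/^title/' '{100}'

should split the file at each line beginning with "title",
writing the pieces to files named xx000, xx001, and so on.
Anything before the first such line goes into xx000, and
{100} repeats the pattern 100 more times, so the file needs
at least 101 matching lines or csplit stops with an error
and removes its output (add -k to keep the files already
written). I haven't tested this, so check the man page.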
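To see how csplit numbers its pieces, here is a small sketch on a made-up file. The file name sample.txt, its contents, and the "part" prefix are only for illustration; your real data will differ, and the repeat count has to fit the number of "title" lines in it.

```shell
# Made-up sample: one header line, then two blocks that each start with "title".
printf 'Search results\ntitle A\nbody a\ntitle B\nbody b\n' > sample.txt

# -s  suppress the byte counts csplit normally prints
# -n 3  use three-digit suffixes
# -f part  name the pieces part000, part001, ... instead of xx000, ...
# '{1}' repeats the pattern once more, so two "title" lines are matched in total.
csplit -s -n 3 -f part sample.txt '/^title/' '{1}'

# part000 holds the header, part001 the first block, part002 the second.
ls part0*
```

Note that the text before the first match gets its own file, so the abstracts themselves start at part001 here.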
Alternatively, does your machine have AWK on it (or can you
install a copy)? It should be reasonably easy to do this with
a simple AWK script, and AWK (or GAWK) is available for most
platforms. You will almost certainly hear Perl suggested, but
I will leave that to the Perl experts (I'm very much a Perl
newbie).
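For what it's worth, here is a minimal AWK sketch of the idea. It assumes every abstract starts with a line beginning with "title"; the sample file and the abs001.txt-style output names are invented for illustration, and I have not tried it on real CCOD output.

```shell
# Made-up stand-in for the real search output.
printf 'title First abstract\nbody one\ntitle Second abstract\nbody two\n' > sourcetext

# Start a new output file at every line beginning with "title".
# Any lines before the first "title" line would go to abs000.txt.
awk '
    /^title/ {
        if (out != "") close(out)          # finish the previous abstract
        n++
        out = sprintf("abs%03d.txt", n)    # abs001.txt, abs002.txt, ...
    }
    {
        if (out == "") out = "abs000.txt"  # text preceding the first title
        print > out
    }
' sourcetext
```

Closing each file before opening the next keeps AWK from running out of open file handles on a large input. If your abstracts start with "Title" or "TITLE", adjust the pattern to match.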
Good luck,
Bernard
--
Bernard P. Murray, PhD
Dept. Cell. Mol. Pharmacol., UCSF, San Francisco, USA