[Toybox] [PATCH] An implementation of the 'csplit' command

Oliver Webb aquahobbyist at proton.me
Tue Sep 12 21:43:15 PDT 2023


------- Original Message -------
On Tuesday, September 12th, 2023 at 2:36 PM, Rob Landley <rob at landley.net> wrote:


> On 9/11/23 23:56, Oliver Webb via Toybox wrote:
> 
> > I have made a implementation of the 'csplit' command in about 160 lines of code.
> 
> 
> You have TOYFLAG_MAYFORK on this command. Sigh, explaining the lib/toyflags.h
> values is one of the tutorial videos I need to make.

You do cover the behavior of the NOFORK and MAYFORK flags
in your video on the true/false commands.

> I dunno why csplit would want MAYFORK here. A normal command can just xexit()
> and let the kernel close filehandles and free memory when the process exits. I
> note that 95% of the overhead of fork/exec is the exec part, not the fork part,
> so "fork and call toy_find("blah")->toy_main()" is still pretty cheap. (On

I have removed the MAYFORK flag in the implementation.

> > The other main one is the fact it doesn't do "[LINE] {[NUMBER]}" cleanly yet.
> 
> 
> I applied what you sent verbatim and haven't started cleaning anything up yet,
> if you have more work to do I'm not actually familiar with csplit. (Never used
> it, still need to come up to speed...)

I do have another patch to submit. It fixes the "LINE {NUMBER}" case,
but introduces a new problem: it doesn't handle "[RULE] {NUMBER} LINE" correctly,
because it resets the line number every time it encounters a "{}" rule.

After a more careful reading of the POSIX standard, I realized that the "%regexp%" rules
don't function like "/regexp/1": they _exclude_ lines up to the pattern. After some fiddling
with the code that switches files and fprintf-s lines to them, I got the "%regexp%" rules
to work as they should. The first run below is my implementation, the second is coreutils:

  $ seq 10 | ./csplit - 2 %4% 7 -s
  $ cat xx0*
  1
  4
  5
  6
  7
  8
  9
  10
 
  $ seq 10 | csplit - 2 %4% 7 -s
  $ cat xx0*
  1
  4
  5
  6
  7
  8
  9
  10
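
To illustrate the difference between the two rule types, here is a minimal
sketch in C. It is not taken from the attached patch: the helper name
advance_to_match() and the idea of passing a NULL stream to mean "discard"
are assumptions made for the example. It also tests the current line against
the pattern, which may not be where csplit starts searching.

```c
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper, not from the patch: consume lines[*pos..n) until one
 * matches `pattern` (a POSIX basic regular expression).
 *
 *   out != NULL  consumed lines are written to `out`   ("/regexp/" style)
 *   out == NULL  consumed lines are silently dropped   ("%regexp%" style)
 *
 * The matching line is left unconsumed so it begins the next piece.
 * Returns 0 if a match was found, -1 otherwise. */
static int advance_to_match(const char **lines, int n, int *pos,
                            const char *pattern, FILE *out)
{
  regex_t re;
  int found = -1;

  if (regcomp(&re, pattern, REG_NOSUB)) return -1;
  while (*pos < n) {
    if (!regexec(&re, lines[*pos], 0, NULL, 0)) {
      found = 0;
      break;
    }
    if (out) fprintf(out, "%s\n", lines[*pos]);
    (*pos)++;
  }
  regfree(&re);

  return found;
}
```

With the input from the example above, starting at line 2 and calling this
with pattern "4" and a NULL stream skips lines 2 and 3 and stops at line 4,
which is why those two lines are missing from the concatenated output.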


I pass -s to both because I have not gotten the file size output to work
consistently yet. The implementation stat()s the files it has written, which returns
byte sizes that differ from those printed by the coreutils csplit I am testing against.
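
This message does not establish why the sizes differ. One possible cause,
offered only as a guess, is that stat() runs while stdio still holds
unflushed data for the file. A sketch of an alternative that avoids stat()
altogether is to add up the return values of fprintf(). The helper name
put_line() is hypothetical and does not come from the patch.

```c
#include <stdio.h>

/* Hypothetical helper, not from the patch: write one line plus a newline
 * to `out` and add the number of bytes written to *total.
 * Returns 0 on success, -1 if fprintf() reports an error. */
static int put_line(FILE *out, const char *line, long long *total)
{
  int len = fprintf(out, "%s\n", line);

  if (len < 0) return -1;
  *total += len;

  return 0;
}
```

Resetting the counter whenever the output file changes would give one byte
count per piece. For the `seq 10` input split at line 2, the first piece
holds only "1" and a newline, so its count would be 2.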
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 9e2b87dc.patch
Type: text/x-patch
Size: 3150 bytes
Desc: not available
URL: <http://lists.landley.net/pipermail/toybox-landley.net/attachments/20230913/e51e3ecf/attachment.bin>

