[Toybox] [PATCH] grep: add --line-buffered and fix regular buffering.

enh enh at google.com
Mon Feb 25 12:21:16 PST 2019


On Mon, Feb 25, 2019 at 12:01 PM Rob Landley <rob at landley.net> wrote:

> On 2/25/19 11:29 AM, enh wrote:
> >     To be honest I was calling xflush() from so many places because it
> was an easy
> >     way to get the if (ferror) perror_exit("write"); Possibly what I
> want is an
> >     xferror() that xflush() can call...
> >
> >
> > where would you want to call xferror that xflush wouldn't be appropriate?
>
> Anywhere you want to stop a command after "head" ate the first screen of
> data
> and closed the pipe, or similar. (cmp <(blah) <(blah), yes | anything,
> etc.)
>

assuming s/screen/buffer/...


> (It's basically another facet of the sigpipe problem.)
>
> > (fread
> > is the main place one needs ferror normally, and toybox has remarkably
> few calls
> > to fread, neither of which seem to actually need to care.)
>
> The only _input_ advantage of FILE * is really for getline(). Everything
> else
> it's just as easy to read input blocks and chop 'em up myself.
>
> (The problem with getline() is you don't know where the terminators are
> ahead of
> time so you never know how _much_ to read, so you either read a byte at a
> time
> which is _painfully_ slow or you have leftover data you need to keep around
> because zcat | thingy isn't seekable input.)
>
> Write collation's generally fun, and you can sing "nagle" to the dreidl
> song,
> but then we spend all our time arguing about flushing. :)
>
> P.S. I'm _so_ glad dprintf() made it into posix-2008. Pity there isn't a
> dscanf() but that gets back to getline on unbufferd filehandles being hard.
>
> >     Sigh, we should probably make the helpful ones explicit and remove
> the rest.
> >
> > sgtm.
> >
> > looking at all the callers, xputs isn't used much at all (21 calls in
> toys/) and
> > most of them seem dubious. xprintf is a lot more popular (338), but --
> though
> > it's harder to tell because there are so many -- nothing particularly
> > convincingly in need of a flush stood out.
>
> Before you go _too_ far down that path...
>
> What I propose is having xprintf() and xputc() and friends still do the
> "check
> for error and xexit()", but _not_ do the flush. Call xflush() explicitly
> when
> you need a flush.
>

right, so they'd bail you out early from a buffer-full flush, but otherwise
be just like the direct call.

sgtm.


> > i'm assuming this will be like FLAG, where we'll do it as we're touching
> things
> > for other reasons?
> >
> >     I'll add a cleaup pass to the todo heap...
> >
> > (you should probably keep that list checked in, or even stored as issues
> in the
> > github bug tracker.)
>
> It does not make any sense to anyone other than me, and is not nearly as
> organized as you think. Attached is my "top of the heap" file. Which is
> only one
> of the many "not checked in" files at the top of my working toybox tree:
>
>   $ git status | grep -v / | grep -v '[.]sw' | wc
>        77     116    1165
>
> Which is not one of the entries in the "todo" subdirectory there:
>
> $ ls todo
> 19.patch           howto.txt       ootree.patch            todo2.txt
> bc.lib             iconv.txt       patches                 todo3.txt
> blah               ifconfig.txt    patch.patch             todo4.txt
> blah2              lib.patch       pending.patch           todo.android
> config2help.patch  lsofdiff.patch  projects.txt            todo.small
> date.patch         ltrace.sh       pscomments.patch        todo.txt
> dest               mdev2.patch     ps.txt                  tofix.txt
> explicit.patch     mdev.patch      release.txt             torelease.txt
> explorer.patch     mdev.txt        sub                     toysh.c
> file2.diff         meep.patch      sub2                    wc2.patch
> file.diff          needed.txt      sub3                    wc.patch
> getconf2.c         netcat.patch    temp.patch              zcat.txt
> getconf.c          net.diff        test.txt
> getconf.sh         nettest         test_xargs.diff
> gzip.txt           newsh.c         this is a longish name
>
> Which is not one of the 34 changed files "git diff" shows with notes to
> self like:
>
> --- a/toys/other/losetup.c
> +++ b/toys/other/losetup.c
> @@ -4,7 +4,7 @@
>   *
>   * No standard. (Sigh.)
>
> -USE_LOSETUP(NEWTOY(losetup, ">2S(sizelimit)#s(show)ro#j:fdca[!afj]",
> TOYFLAG_SB
> +USE_LOSETUP(NEWTOY(losetup, ">2S(sizelimit)#s(show)ro#j:fdcaA[!afj]",
> TOYFLAG_S
>
>  config LOSETUP
>    bool "losetup"
> @@ -29,6 +29,7 @@ config LOSETUP
>      -o Start association at OFFSET into FILE
>      -r Read only
>      -S Limit SIZE of loopback association (alias --sizelimit)
> +    -A Auto-detach device when unmounted
>  */
>
>  #define FOR_losetup
>

(that kind of thing especially can be done as a TODO: at the top of the
file. there's even some precedent.)


> Which is a reminder to me to use
>
> https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/commit/?id=96c5865559ce
> and then probably have mount.c take advantage of that (except "object
> lifetime
> rules" is my go-to thing to harp on in software designs and changing them
> like
> that requires reinspection of assumptions)...
>
> Moving up one level from there, the "toybox" directory that my actual
> toybox git
> repos (plural) are in has 82 files/directories in it. Some of which ar test
> files, like thr.c which has:
>
> #include <pthread.h>
> #include <stdio.h>
>
> void *spin(void *data)
> {
>   unsigned i;
>
>   for (i = 0; i<4000000000; i++);
>
>   return 0;
> }
>
> int main(int argc, char *argv[])
> {
>   pthread_t tt[4];
>   void *res;
>   int i;
>
>   for (i=0; i<4; i++) pthread_create(tt+i, 0, spin, 0);
>   for (i=0; i<4; i++) pthread_join(tt[i], &res);
>
>   return ;
> }
>
> Which we were talking about a week or so back, about making top -H get
> per-thread CPU right. (The test file reminds me of the todo item.)
>
> And of course there's browser tabs:
>
>
> https://en.wikipedia.org/wiki/Design_of_the_FAT_file_system#File_Allocation_Table
> https://rsync.samba.org/how-rsync-works.html
> https://en.wikipedia.org/wiki/Karatsuba_algorithm
>
> http://cgit.openembedded.org/meta-openembedded/tree/meta-oe/recipes-core/toybox
>
> (Sooo many browser tabs.)
>
> And buckets of old emails with yet more, todo items left in them, ala:
>
> http://lists.landley.net/pipermail/toybox-landley.net/2019-February/010196.html
>
> And open console tabs with things in them, most recently this grep output:
> toys/posix/nl.c:    xprintf("%s", line);
> toys/posix/ps.c:    printf("%s", TT.pgrep.d ? TT.pgrep.d : "\n");
> toys/posix/strings.c:          printf("%s", string);
> toys/posix/ulimit.c:          printf("%s", toybuf);
>
> Which reminds me "oh right, do a cleanup pass on the tree for the %s stuff
> we
> were talking about last email"...
>
> And another tab has open a 250 line "podcast.txt" file (not from any of the
> above directories) I've written trying to outline a walkthrough of the
> toybox
> code reminding me of concepts I need to remember to try to explain (here,
> I'll
> attach that too, of course it's _also_ unfinished and incoherent and means
> nothing to anyone but me)...
>
> It's not as simple as "check it in". As with things like "test suite" and
> "documentation", there's a huge amount of cycles needed just to _curate_
> it and
> process this compost heap into usable work product...
>
> As I said, this mess tends to be a symptom of "not enough time to clear
> backlog"
> so even little things accumulate. Heck, I've got a dozen or so
> half-composed
> email reply windows open just like this one...
>

i know, you list these every time this comes up :-)

but even if you can't solve the whole problem, anything you can do to
reduce your https://en.wikipedia.org/wiki/Bus_factor helps...


> Rob
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.landley.net/pipermail/toybox-landley.net/attachments/20190225/a44318c4/attachment-0001.htm>


More information about the Toybox mailing list