[Toybox] [PATCH] watch: flush the buffer each round.

Rob Landley rob at landley.net
Wed Mar 13 14:58:08 PDT 2024


On 3/12/24 12:24, enh wrote:
> On Tue, Mar 12, 2024 at 8:45 AM Rob Landley <rob at landley.net> wrote:
>>
>> Hello from Minneapolis: https://mstdn.jp/@landley/112078501045637288
>>
>> Still _really_ fried from my move, but at least chipping away at the email
>> backlog...
> 
> at least it's the same timezone :-)

That helps less than you'd think. (Flying to Japan and back is a day's recovery
time.)

>> My original approach to FILE * was to just always flush (ala xprintf) but there
>> were objections to that, which I said would open a can of micromanagement worms.
>> Then I changed the default buffer type to be unbuffered instead, and there were
>> objections to that, which I said would open a can of micromanagement worms...
> 
> (well, yeah, the "don't actually flush" argument was weird. but
> there's nothing inherently wrong with "flush and exit on failure", and
> for those places that _do_ have an opinion on when is a good time to
> flush, using that rather than directly using fflush would be
> reasonable?)

I thought xprintf() doing a flush was reasonable, and every single manual
flush() was a wart. Clearly my design sense is inapplicable to this area.

>> > but none of this
>> > is relevant here --- the problem here is that we're not flushing _at
>> > all_, so ToT watch is just broken.
>>
>> Micromanagement, yes.
> 
> not really. there's an obvious point for watch to flush --- "that's
> this round of output done, and i'm an interactive program". no
> heuristic is going to be right for that because it's going to depend
> on stuff like "what's the update frequency? how much attention is the
> user paying? what's the content?". `watch ls -l` with a 5s frequency
> being a second or so behind is fine. `watch date` with a 1s frequency
> not.

Naming things and cache invalidation: when dealing with a famously hard part,
choose a strategy of requiring explicit manual management. Not because the
language requires it, but because we volunteered as a performance optimization.

>> >> This code is mixing dprintf(1, ...) with printf() and fflush(stdout) in a way
>> >> that seems unlikely to end well, and I would like to think it through when I'm
>> >> not moving house. (Or have somebody whose brain isn't jello reassure me that
>> >> it's coherent.)
>> >
>> > not in the ToT copy it isn't. maybe you have local changes?
>>
>> Hmmm, yup looks like I was already fixing this command up to not use FILE * at
>> all anymore (what is this, my fourth attempt to get away from FILE * buffering
>> damage in a systematic way?) and got distracted partway through...
> 
> what's the motivation in watch? i don't see any benefit?

I don't remember at the moment (still fried), but probably just gradually
switching to "never use stdio for anything" now that it's buffering?

Back before the default of calling printf() was to produce no output without
remembering to say "simon says", you could freely mix file descriptor writes and
file pointer writes, so I didn't have to CARE if crunch_escape() in lib/ was
calling fputs() or write() behind the scenes.

But now I need to know. Now I need to retroactively go through and retrace every
single output path in every single command, and as with the
dprintf/printf/dprintf example pasted above (whether checked in or not), you
can't just stick a flush in a loop once and expect stuff to go out in the right
order.

I'm aware of the argument "use FILE * for everything, convert even tee to work
based on FILE * instead of filehandles, and then it's all consistent". Which is
the danegeld systemd/windows/apple argument: this tool is so terrible
interoperating with ANYTHING else that it must purge all other tools from its
ecosystem, so bow before it and obey the imperative to cleanse the unbelievers.

If I reacted well to that argument, I wouldn't have wound up on Linux in the
first place...

> stdio with
> explicit flush is _great_ for this use case, where the caller has a
> specific "and i'm done for now" point that _it_ knows.

When start_redraw() calls terminal_probesize() it outputs the probe sequence
through xprintf() but doesn't do a fflush(). (Before the buffer type changed, it
didn't need to. Before xprintf() lost its flush, it didn't need to.) So the
query to see if we got a response or not isn't going to get a response, so the
first screen draw will not take that size query into account when operating
across a serial terminal or qemu console. But there's a fallback path (assuming
80x25) so we may not notice the code being subtly broken.

We will find places the code is subtly broken by this for _years_...

>> > so the new patch (attached) is a better fix. but
>> > that fixes the normal case and -b for me...

I deleted my local changes and applied it. Pity, I'd improved the help text I
thought...

>> Spraying down the code with manual fflush() calls strikes me as a code smell
>> that means "this will break the next time anybody touches anything".
> 
> in general, yes. but (as i've argued above) i really think the
> opposite for something like watch.

"This entire area is a mess of contradictory exceptions."

"Yes, but here is a special case."

  George Newman : I need a drink.
  Bob Steckler : You don't drink.
  George Newman : Yeah, but I've been meaning to start.
  - UHF

> and i think that's actually true in
> general --- the cases i've had to fix have all basically been these
> kinds of interactive programs where there's a specific "done for now!"
> point in the code, and we need a flush there.

Then _add_ a flush there. In xpoll() or scan_key_getsize() or...

It still doesn't solve the "output happens out of order" problem from mixing
writes to the two output types hidden behind library functions because it never
mattered before. Possibly no library function should return with unflushed
output pending, but that's why xprintf() and xputc() and friends flushed, and
then why the default was unbuffered.

> (at the risk of encouraging you, the way to automate this would
> probably be "thou shalt not block other than on i/o without flushing

Doesn't fix the sequencing issue for the two output types.

> --- so any sleep or poll or whatever must flush". but that's going to
> flush in places where you don't care,

Superfluous flushes are part of FILE * having a nonzero buffer size.

> involve more "magic" behind the scenes

Library functions leaving invisible output pending unflushed in FILE * is magic.
NOT doing that is less magic. That's why xprintf() flushed.

> and -- as i've just argued -- not be solving any real problem.
> i'll change my mind if/when we start hitting cases where "spraying
> down" is actually true, rather than "insert one flush at the
> semantically meaningful point".)

If spraying down were possible, we could fix (most of) the problem all at once.
Instead we boil the frog for years and wind up with dozens but since we inserted
them one at a time each individual one seemed small, and then we'll _still_
break it with new changes going forward.

That's the selected paradigm. I'm just coming to terms with a perpetual bug
generator being part of the design now.

>> But I mentioned still being fried from the move, which makes me extra irritable.
>> (When did U-haul become a scam? That's new since the last time I used them...)
> 
> yeah, probably not the best time to bring up something that's always
> wound you up for some reason! (don't blame me --- like i say, i don't
> use watch. this was a user :-))

This was a regression caused by adding buffering to FILE *. It won't be the last.

>> Rob

Rob


More information about the Toybox mailing list