[Toybox] [PATCH] More use of sendfile.

Rob Landley rob at landley.net
Wed Mar 18 14:36:06 PDT 2020



On 3/18/20 12:48 PM, enh wrote:
> On Tue, Mar 17, 2020 at 8:10 PM Rob Landley <rob at landley.net> wrote:
>>
>>
>>
>> On 3/17/20 6:41 PM, enh wrote:
>>> On Mon, Mar 16, 2020 at 5:50 AM Rob Landley <rob at landley.net> wrote:
>>>>
>>>> On 3/15/20 12:44 AM, enh via Toybox wrote:
>>>>> This patch started by removing the CAT_V config option and using FLAG(),
>>>>> and switching to xsendfile() for the no-options fast path.
>>>>>
>>>>> This pointed out that the 'x' in the xsendfile family was significantly
>>>>
>>>> Hang on, I need a beverage.
>>>>
>>>> The patch touches 8 files. Right....
>>>>
>>>>> weaker than the usual guarantee, silently ignoring a variety of errors.
>>>>> It also made me realize that xsendfile() wasn't actually utilizing
>>>>> sendfile(2).
>>>>
>>>> Has it be 7 years already?
>>>>
>>>>   https://lkml.org/lkml/2013/9/17/25
>>>>
>>>> Hmmm, 2.6.33:
>>>>
>>>>   http://man7.org/linux/man-pages/man2/sendfile.2.html
>>>>
>>>> Yes it has. Ok.
>>>>
>>>>> This patch removes sendfile_len (without the 'x') which only had one
>>>>> user that wasn't checking for failure and looks like it should have used
>>>>> the 'x' form instead.
>>>>
>>>> The idea was that httpd shouldn't exit if one transaction fails, cp shouldn't
>>>> exit if one file it's copying fails... These days I've added a WARN_ONLY flag to
>>>> the xopen() flag set, but that doesn't help for things that don't take open flags.
>>>>
>>>>> It also changes xsendfile_len() to take a boolean
>>>>> argument to specify whether you want exactly that number of bytes (in
>>>>> tar, say) or up to that many bytes (as in head, which this patch also
>>>>> switches over to xsendfile).
>>>>
>>>> It's the middle of the night so I should not be reviewing this, but a quick
>>>> first pass:
>>>>
>>>> I tend to do wrappers that eliminate unnecessary arguments, but you have exactly
>>>> one user of each codepath. Sigh.
>>>>
>>>> Long english error message, "sent %lld (of %lld) bytes" is less of a lift for
>>>> non-english speakers?
>>>
>>> it's information that you otherwise can't access, and it's basically
>>> what coreutils tar outputs.
>>
>> Yes, I was proposing the shorter phrasing as being possibly easier for
>> non-english speakers.
> 
> ah, i see. because i sent the patch as an attachment rather than
> inline, i didn't even notice that you'd changed that from what i'd
> written, so i guess that's a implicit "lgtm" :-)
> 
>>>> Moving the size = down later accomplishes what?
>>>
>>> nothing much. just means we don't waste time initializing size if
>>> we're on the fast path where we won't use it, since i was in here in
>>> the first place trying to work out why our cat was slower than
>>> everyone else's. but, yeah, the switch to actually using sendfile(2)
>>> is the order of magnitude change.
>>
>> Sorry I've been slow on this, I was packing/stressing for my trip to tokyo,
>> which got pushed back a month and a half at the last minute. (In part because I
>> caught what is probably the flu. Coughing and short of breath, but no fever. So
>> I'm still in the states, but not really programming on all cylinders just now...)
> 
> yeah, no worries. none of this is a regression. (fwiw, i've been
> deliberately keeping my hexedit rebase to myself until after the
> release,

Trying to get that out this weekend, but I'm SO CLOSE to having toysh actually
boot the mkroot init script to a shell prompt. (Whether it runs the mkroot
_build_ is another question; "make airlock" would put "sh" in the $PATH and
something's likely to pick that up, but for the moment the fix is tweaking "make
airlock" to take it back out again until it's ready.)

> and my massive work-in-progress man is looking fairly decent
> at this point,

That's in pending, fair game to completely replace at any time. :)

> and i've still got the "extract readelf's guts into
> lib/elf.c and then reuse it for file/size/nm too",

Sounds like a good idea.

> and i have a pretty
> decent strace [minus -f], but i'm trying to keep to smaller stuff for
> now.

Cool!

> i do miss my "fancy" hexedit every time i'm looking at an ELF or
> zip file on the wrong machine though --- the others i can just use the
> Debian tools!)

Lemme try to get this release out this weekend.

>> I remember sendfile not sending along error messages reliably, but I tested it
>> years ago...
> 
> no, it's not sendfile(2) that's the issue --- it's that xsendfile()
> had a bunch of special cases where it would ignore errors. in
> particular, it never reports errors on stdout (!) which is why there's
> a cat test that fails if i switch cat to xsendfile().

because "cat | head" is not an error. (And at one point, short writes to stdout
were producing error messages.)

>>> a timeline might help:
>>>
>>> 1. it started with a very small change to have the simple case in
>>> cat(1) just call xsendfile.
>>> 2. though i switched over to FLAG() while i was there
>>> 3. and i applied your "you shouldn't have to know how toybox was
>>> compiled to know what options it has" from your recent patch to cp, to
>>> cat.
>>
>> It's a good thing to do, yes.
> 
> (i could send that separately, but then we'd have to rebase all this,
> and these are the least complicated parts of the change, so i'm
> assuming it's more trouble than it's worth for now.)

I've sat down to review this and gone off on some tangent when I looked at a
piece of code it referenced three times now, so smaller chunks would be easier
to commit, yes. :)

>> I do know that tar and cp shouldn't exit early if you extract a file too long
>> for a VFAT partition: it should still copy/extract the remaining files. I
>> remember a redesign of this whole area being underway at some point and probably
>> something that was lost like the various dd.c rewrites in a "got busy, got
>> distracted, flushed the tree when I came back because it had been so long  it
>> was less work to start over than reverse engineer my own work to figure out
>> where I left off"...
> 
> isn't that the difference between the perror_exit and perror_msg
> variants that i've effectively added? (that seemed both safer and more
> intention-revealing that the bunch of heuristics for when to just
> ignore errors.)

Explicit is better, yes. (I've been doing WARN_ONLY flags, but there's only
certain channels you can stick that in through. And I just fixed a bug in one
because I haven't filled out the test suite with testing nearly enough _failure_
paths. Which is hugely important and just generally huge, and is on my "big test
suite embiggening" todo heap for around the 1.0 release.)

>> Looks unfinished either way. Needs more test cases, as usual. (There are places
>> I go to some effort to make the tarball tracking recover and continue in the
>> face of other disk errors, this does not appear to be one of them. Sparse files
>> were added kinda quickly at the last minute while I was juggling too many other
>> balls...)
> 
> yeah, this was a place where GNU tar tells you that something unusual
> happened but toybox was just quietly pretending everything was fine.

It should perror_msg(), which sets errno.

> (there's a visible difference in terms of error messages with some of
> the existing tests, and this patch seems to bring us closer to
> coreutils.)

tar isn't part of coreutils, it's a separate package.

That's ANOTHER giant todo heap I've done the start of work for. I want to get
toybox into buildroot, debian, and openembedded/yocto. And _that_ means
annotating it in their build system with what packages it replaces, and figuring
out the appropriate install magic to create or not create the relevant symlinks
depending on what else is installed. (I should look how they handle busybox, but
not all of them _do_ fully. Buildroot does, and boy is it intrusive...)

Anyway... working on getting a release out.

Rob



More information about the Toybox mailing list