[Toybox] grep getdelim() and errno

enh enh at google.com
Mon Mar 15 12:33:19 PDT 2021


for those who were curious...

it looks like scudo [https://source.android.com/devices/tech/debug/scudo]
calls `getauxval(AT_HWCAP2)` (_not_ `AT_HWCAP`) to check for MTE support at
surprising times. and getauxval() sets errno to ENOENT if there's no match,
because 0 might legitimately be the result of a successful getauxval().
(and this never happens on x86-64 because scudo knows at compile time that
it's only possible to have MTE on arm64.)

i'm not convinced scudo should be calling getauxval() that late, so i'll
prod them about that, but i think this case can be closed at least.

On Fri, Mar 12, 2021 at 12:07 PM enh <enh at google.com> wrote:

>
>
> On Fri, Mar 12, 2021 at 12:43 AM Rob Landley <rob at landley.net> wrote:
>
>> On 3/11/21 6:16 PM, enh via Toybox wrote:
>> > attached is the patch that's the reason why this morning i went through the
>> > other places where toybox looks at errno without first checking that the
>> > function call failed...
>> >
>> > i think we should take this fix since POSIX allows errno to be clobbered by a
>> > successful call to most functions ["The setting of errno after a successful call
>> > to a function is unspecified unless the description of that function specifies
>> > that errno shall not be modified"], and that certainly happens in practice.
>> >
>> > there's definitely something odd going on here though. the reason i've not added
>> > a test case is that i can't reproduce this at all on glibc/x86-64, nor on
>> > bionic/x86-64. i can reproduce it on bionic/arm64, but not quite as reliably as
>> > the person who hit this in practice while doing real work (and they only see it
>> > 1/100 runs). but what we do have in common is that we're seeing errno set to
>> > ENOENT (!), and i have no explanation for how getdelim() is clobbering errno
>> > with that specific value. so i think we have more than one bug, but this is a
>> > bug regardless, so... patch attached.
>>
>> Sigh. I'm not gonna argue with reality occurring in the field, but EEEEWWWWW.
>>
>> Applied.
>>
>> Rob
>>
>> P.S. It's because the function is calling other functions internally which can
>> fail and be fixed up or retried. My first guess would be down under the
>> reallocarray call on line 106 of getdelim.c,
>
>
> yeah, the person who first hit this claims it's happening from the
> calloc() in recallocarray(), but statically at least, there's no way our
> allocator can set errno to ENOENT. (and i do have an strace that shows the ENOENT
> isn't from any system call.)
>
> sadly i'm out of devices and disk space right now so i can't go printf()
> debugging, but there's definitely something very odd going on here.
>
>
>> and does it happen under arm64 kvm
>> because that returns buckets of ENOENT every time you look at it funny.
>>
>