[Toybox] strlower() bug

Ray Gardner raygard at gmail.com
Wed May 8 14:27:46 PDT 2024


On 5/8/24 06:48, Rob Landley wrote:
> On 5/6/24 17:12, Ray Gardner wrote:
>> While working on an awk implementation for toybox, I found a bug in
>> strlower(), which is used only in find.c. I've attached some tests to
>> put in find.test to reveal it. I can't put them here directly because
>> I don't think the UTF-8 names will come through. (I modelled my awk
>> tolower()/toupper() code on your strlower().)

> Your test doesn't create the files you're finding, so find is supposed to fail?

The tests were not intended to tell anything about find per se, but only
to elicit a memory violation to show the bug in strlower(). I didn't see
any easier way to do it than via the only command that uses strlower().

> Your first test doesn't barf under ASAN, and then the second one's going to fail
> because echo -n | wc says it's 258 bytes and the VFS file length limit is 255
> bytes, so there CAN'T be a file named that on Linux. (Path length != path
> component length, there's no slashes in there.)

I was not aware of the filename length limit in Linux, thanks. The first
test was just meant as a sort of "null test" to show that the command
works OK with the expanding characters if there are just a few of them.

>> The problem is in the test if the output string needs to be enlarged
>> to take an expanded lowercase:
>>     // Case conversion can expand utf8 representation, but with extra mlen
>>     // space above we should basically never need to realloc
>>     if (mlen+4 > (len = new-try)) continue;
>>
>> The mlen+4 needs to be mlen-4 to leave at least 4 bytes for the next character.

> Hmmm, possibly. I still don't understand what your test case is testing. (Just
> trying to trigger an ASAN violation with an otherwise nonsense test?)

Yes, the test is for strlower(), not for find. And I got errors from the
allocator even without ASAN.

>> As the comment indicates, it should "never" need to realloc;

[snip]
> The second is "basically never" because it requires an insane input string, but
> that's user controlled and users do crazy things, sometimes even intentionally.

That was me...

>> it takes a very long name of uppercase characters that do expand when
>> made lowercase. But the code is there to handle that very case.

[Rob analyzes why that was wrong and my test was suboptimal ...]

> Sigh, lemme come up with a test that demonstrates the fix working... the minimal
> one seems to be ./find . -iname aaaaaȺȺȺȺȺȺȺȺȺ

I should have done better on the test cases; I didn't think it would
overflow on one that short. But yours proves that maybe it could happen
on a non-insane input string?

[info about error handling etc.]

Thanks for the info about error handling etc. I'm still a newb at Linux,
though not exactly new to programming...

BTW I was a bit surprised that mentioning my awk for toybox got no reaction.


More information about the Toybox mailing list