[Toybox] [BUG] du applet prints wrong output for 2GB+ files

Rob Landley rob at landley.net
Sat Oct 1 16:36:34 PDT 2016


I do have an opinion, but I was out of comission for a couple days with
eyestrain. (I fell asleep with my glasses on and rolled over them,
bending them all out of shape. Then yesterday I accidentally sat on them
and it bent them _back_. Yay? I should get a spare pair...)

I don't like the warnings. I don't like adding lots of typecasts for the
warnings either. And I don't like intptr_t because the only place it's
NOT long is windows. (It's long on linux, bsd, macos...)

This 32/64 bit problem is basically the same problem with command line
option parsing writing # fields into a long. The field needs to store
pointer and integer data, and the size that does that without gcc
flipping out and spewing warnings everywhere is a long. (I'm also
slightly concerned that LP64 defines sizeof char, short, int, and long,
but only gives a MINIMUM size for long long. In theory that could be 128
bits.)

The original design idea was that if you need more data than fits in
->extra, you malloc more data and store a pointer to it in ->extra. The
problem is, this is a corner case. For 64 bits, long is already the
right size, so any code I add is _just_ added for 32 bit platforms,
which makes me uncomfortable.

But in this case, I didn't mean for du to max out at 2 gigs on 32 bit
systems, I meant for it to max out at 2 _terabytes_ on 32 bit systems,
because 1<<32 blocks * 512 is 2 terabytes. (I did think of this, but got
it wrong. The math needs some typecasts.)

While we're at it, this should be unsigned math given that modern
optimizers are INSANE and think that signed integer math should somehow
give you a different bit pattern than unsigned integer when it
overflows. (Which is 100% wrong.)

Anybody else notice that when C++ developers got ahold of compiler
development and started rewriting C compilers in C++, they've been
trying to turn C into C++? C is a portable assembly language with
minimal complexity between the developer and what the hardware is
actually doing, and C++ is a minefield of insane corner cases that make
no sense and have to be memorized. They're using the optimizers to try
to take away C's advantage of C++ (which is hilarious because that's
just the tip of the iceberg). They are wrong and the fix is to unbreak
the darn compiler, either with -fno-being-stupid or in extreme cases -O0.

(And no I'm not switching dirtree->extra to unsigned because then
storing filehandles in it gets funky.)

Rob

On 09/30/2016 11:31 AM, enh wrote:
> those particular ones are, though it would be better to explicitly
> cast to uintptr_t (or use a data/ptr union in struct dirtree) to make
> it clear we're doing this on purpose. i assumed rob would have a
> strong opinion one way or the other though...
>
> On Fri, Sep 30, 2016 at 1:39 AM, darken <darken at darken.eu> wrote:
>> I can confirm that the patch works.
>> With the patch there are a few warnings during compilation though, safe to
>> ignore?
>>
>> See: https://gist.github.com/d4rken/710cd1ab6265e062aaa2374e7be75452
>>
>>
>>
>> 2016-09-29 20:44 GMT+02:00 enh <enh at google.com>:
>>>
>>> the attached patch fixes LP32 for me, without obviously breaking
>>> anything. a quick grep didn't show any obvious case where making
>>> dirtree::extra larger would break anything, but rob will know
>>> better...
>>>
>>> On Thu, Sep 29, 2016 at 11:37 AM, darken <darken at darken.eu> wrote:
>>>> Why isn't `TT.total` the issue?
>>>>
>>>> I couldn't find `dirtree::extra`, could you link me to that?
>>>>
>>>> A patch would consist of changing one or more `long` into `long long`?
>>>>
>>>> 2016-09-29 20:22 GMT+02:00 enh <enh at google.com>:
>>>>>
>>>>> (looks like the real problem is dirtree::extra is long.)
>>>>>
>>>>> On Thu, Sep 29, 2016 at 11:17 AM, enh <enh at google.com> wrote:
>>>>>> works for me on Nexus 9. looks like du.c's TT.total is a signed long,
>>>>>> so on LP32...
>>>>>>
>>>>>> On Thu, Sep 29, 2016 at 11:07 AM, darken <darken at darken.eu> wrote:
>>>>>>> The du applet prints a ridiculously large size if the total size
>>>>>>> exceeds
>>>>>>> 2GB.
>>>>>>>
>>>>>>> Reproduced on a Nexus5 with Android 6.0, but I don't think this is
>>>>>>> Android
>>>>>>> specific.
>>>>>>> It doesn't matter if it's multiple files or a single one.
>>>>>>> Could reproduce it by dd'ing a 2GB file full of zeros.
>>>>>>>
>>>>>>>> root at hammerhead:/cache # busybox du -hs /sdcard/2GBTestFile
>>>>>>>>
>>>>>>>> 2.5G    /sdcard/2GBTestFile
>>>>>>>>
>>>>>>>> root at hammerhead:/cache # ./toybox-armv6l du -s /sdcard/2GBTestFile
>>>>>>>>
>>>>>>>> 18446744073707954404    /sdcard/2GBTestFile
>>>>>>>>
>>>>>>>> root at hammerhead:/cache # ./toybox-armv6l du -hs /sdcard/2GBTestFile
>>>>>>>>
>>>>>>>> 16E     /sdcard/2GBTestFile
>>>>>>>
>>>>>>>
>>>>>>> This ticket has more console logs showing the issue.
>>>>>>> https://github.com/landley/toybox/issues/51
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> Toybox mailing list
>>>>>>> Toybox at lists.landley.net
>>>>>>> http://lists.landley.net/listinfo.cgi/toybox-landley.net
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Elliott Hughes - http://who/enh - http://jessies.org/~enh/
>>>>>> Android native code/tools questions? Mail me/drop by/add me as a
>>>>>> reviewer.
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Elliott Hughes - http://who/enh - http://jessies.org/~enh/
>>>>> Android native code/tools questions? Mail me/drop by/add me as a
>>>>> reviewer.
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Elliott Hughes - http://who/enh - http://jessies.org/~enh/
>>> Android native code/tools questions? Mail me/drop by/add me as a reviewer.
>>
>>
> 
> 
> 



More information about the Toybox mailing list