[Toybox] [PATCH] ls: Ensure file names are separated by 2 spaces

Sun Oct 27 21:36:14 PDT 2019

On 10/27/19 7:23 PM, Andrew Ilijic wrote:
>  I did `git format-patch -2 $HEAD` "2" for two commits. Clearly, that
> did not have the desired effect. I think I also had to copy and paste
> the patch for some reason, which also may have messed it up.

The git format-patch line should have worked, it was probably the cut and paste.

> Do you want me to redo and submit the patches?

Yes please.

>> -      if (curcol < 255) printf("%*c", curcol, ' ');
>> +      // Use dtlen and ul to not print spaces when we are at the end
>> +      if (curcol < 255 && ((dtlen - ul - 1) > 0)) printf("%*c", curcol, ' ');
>>
>> You don't need a comment on every single line of the code. That's a good commit
>> comment, but there isn't a comment explaining why we're comparing curcol with
>> 255.
>
> I spent a while looking at `if (curcol < 255)` and could not figure
> out why we are doing that comparison. I'll go look at the git history
> and try to see why it is there. If that does not work I'll reach out
> to the person that committed it.

Um, now that I look at it myself I don't quite remember either. What I was
probably _trying_ to do at the time was limit the number of columns because
we're using a paged size global scratch buffer the toybox infrastructure makes
available to every command (toybuf, declared on main.c line 31) to store the
column widths.

Toybuf is some convenient scratch space you don't have to allocate (or worry
about going over your stack size on embedded systems with tiny stacks), but it's
only a 4k buffer which in this case we're treating as an array of unsigned. See
line 303, which is more or less:

unsigned *colsizes = (unsigned *)toybuf, columns = sizeof(toybuf)/4;

The /4 is because unsigned defaults to int, which is 4 bytes according to the
LP64 standard which says char is 1, short is 2, int is 3, long is the same size
as pointer (4 bytes on 32 bit machines, 8 bytes on 64 bit machines), and long
long is 8 bytes.) So we _can't_ track more than 1024 columns unless we switch
that to a dynamic allocation (which is awkward because we don't know how big it
is when we start out, and would have to do it in 2 passes, and it probably isn't
worth it. 1000 columns is already ridiculously large for human readability,
which is what -x and -C are for.

A common thing I used to do in various commands is chop up toybuf into chunks
(ps.c is the most complicated example of this, but there's lots of examples,
such as base64.c using the first 128 bytes of toybuf as a lookup table
(initialized by base64_init() out of lib) and then the _rest_ of toybuf is the
read buffer for the file being converted.

I do that less often now that I've got proper GLOBALS() infrastructure (as
described in http://landley.net/toybox/code.html somewhere) and can just make a
TT.char[128] buffer in there about as easily. Probably what I was doing at the
time was using only the first 256 entries in toybuf (1024 bytes) to store column
sizes, and using the rest of it for something else. And then the code changed to
the current way (where not only does it have all the space, but the column limit
is inherent in how it's initialized so the later code doesn't have to check) and
I missed removing one of the old checks. You can probably yank it.

(Although these days the shared global buffer is 8232 bytes which seems a bit
excessive? I wonder what the largest struct in the union there is...)

(Sorry, I've been meaning to write some FAQ entries about this: I want to
support NOMMU systems and some fairly small embedded environments where "read
only data" is cheap and can be shared between processes, but memory can get
fragmented to the point "multiple pages of contiguous writeable space" are the
limiting factor on starting a new process, which is why I care about that stuff.
Android doesn't. :)

>> So you probably instead want something like dtlen>ul+1 as your test?
>
> Yes, I had that and then decided to change it; I don't remember why. I
> will incorportWrapping parts of the conditional is just a habit that I
> picked up along the way. I will try to break the habit and let the
> code do the work, instead of the parentheses.

It's a judgement call as to what's easier to read. The compiler doesn't care,
but I try to keep a consistent code style in toybox which is the
smallest/simplest representation. (The "coding style" section at the end of
http://landley.net/toybox/design.html talks about that. It's not a hard rule, I
wouldn't reject your patch over it. But the code looking the same makes it
easier to read, and reading code is generally harder than writing it.)

> I thought I was okay
> because, the loop/pointer offset would always be less than the length,
> `dtlen.` I forgot to consider the case where the array is empty thus
> making `dtlen` zero.

Well it's possible it wouldn't get called with an empty array (when the array is
empty it probably doesn't make it to that part of the code), but in that case it
would logically be a != test. (And I don't bother to put != 0 since it's a NOP:
if (a) and if (a != 0) do the same thing in C, that's also in the coding style
bit.) The >0 implies it can be <0, which this can't, and that's a rough edge.
And the other test isn't any more expensive and would still work if anything
changed so it _could_ overshoot somehow. (Avoiding a possible future bug.)

"The code works" is only a subset of "the code is right". :)

> ~Andrew

Rob