[Toybox] wc -l *

enh enh at google.com
Mon Jul 9 13:26:18 PDT 2018


ping?

(despite the lack of PATCH in the title, the fix was attached, and
makes existing tests that currently fail on the host pass.)

On Fri, Jul 6, 2018 at 7:44 PM enh <enh at google.com> wrote:
>
> (note that the test changes also fix HOST=1 which was previously failing those test cases.)
>
> On Fri, Jul 6, 2018, 13:03 enh <enh at google.com> wrote:
>>
>> TL;DR: patch attached
>>
>> (background: i've been trying to use toybox on my desktop too.)
>>
>> i was surprised to see that toybox `wc -l` doesn't columnate:
>>
>> $ ./toybox wc -l [Mm]*
>> 256 main.c
>> 69 Makefile
>> 325 total
>>
>> here's what i was expecting to see.
>>
>> $ wc -l [Mm]*
>>  256 main.c
>>   69 Makefile
>>  325 total
>>
>> i thought i'd send a patch, but:
>>
>> (a) "don't columnate unless more than one flag is set" seems
>> deliberate, but i don't understand why:
>>
>>    for (i = 0; i<4; i++) if (toys.optflags == (1<<i)) space = 0;
>>
>> (b) POSIX does say _nothing_ should be columnated:
>>
>>   By default, the standard output shall contain an entry for each
>> input file of the form:
>>
>>   "%d %d %d %s\n", <newlines>, <words>, <bytes>, <file>
>>
>>   ...
>>
>>   The output file format pseudo-printf() string differs from the
>> System V version of wc:
>>
>>   "%7d%7d%7d %s\n"
>>
>>   which produces possibly ambiguous and unparsable results for very
>> large files, as it assumes no number shall exceed six digits.
>>
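
a minimal sketch in C of the two format strings quoted above, to make the ambiguity concrete. the function names (format_posix, format_sysv) are hypothetical and not taken from toybox or any real wc; only the two printf-style formats come from the POSIX text, with %d widened to %ld.

```c
#include <stdio.h>

/* POSIX-style entry: fields separated by single spaces, no padding.
 * (hypothetical helper, for illustration only) */
int format_posix(char *buf, size_t len, long lines, long words, long bytes,
                 const char *name)
{
  return snprintf(buf, len, "%ld %ld %ld %s\n", lines, words, bytes, name);
}

/* System V-style entry: each count right-aligned in a 7-character field.
 * A count of 7 or more digits fills its field and touches its neighbour.
 * (hypothetical helper, for illustration only) */
int format_sysv(char *buf, size_t len, long lines, long words, long bytes,
                const char *name)
{
  return snprintf(buf, len, "%7ld%7ld%7ld %s\n", lines, words, bytes, name);
}
```

with lines=1234567 and words=7654321 the System V form runs the first two fields together as "12345677654321", which is the unparsable case the POSIX rationale describes; the POSIX form keeps them separated by a space.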
>> ah, i think i see what you were trying to say... you wanted this:
>>
>> $ cat /proc/version | wc -l -
>> 1 -
>> $ cat /proc/version | wc -l
>> 1
>>
>> and `info wc` says
>>
>>   However, as a GNU extension, if only one count is printed, it is
>>   guaranteed to be printed without leading spaces.
>>
>> hmm. except i can't explain this:
>>
>> $ wc -l /etc/csh.*
>>  18 /etc/csh.cshrc
>>  11 /etc/csh.login
>>   1 /etc/csh.logout
>>  30 total
>> $ wc -l /proc/[c]*
>> 12 /proc/cgroups
>> 1 /proc/cmdline
>> 1 /proc/consoles
>> 1296 /proc/cpuinfo
>> 458 /proc/crypto
>> 1768 total
>>
>> i can't explain (a) why the first example uses a column width of 3,
>> nor (b) why the second example doesn't columnate. presumably it's
>> something to do with those files claiming size 0, though i've no idea
>> how/why it's deciding how big to make the *lines* column from the file
>> size. oh, yeah, it can assume that every character in the file is a
>> newline, and thus get an upper bound on the number of lines.
>>
>> okay, so i'm guessing the GNU heuristic is something like a two-pass
>> "stat all the files first, and use the max byte count as the
>> column width", and /proc actually isn't a special case in their code:
>> it's a bug because their heuristic is broken for files that read
>> larger than they claim to be.
>>
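
a rough sketch in C of the heuristic guessed at above: take the largest claimed file size, count its decimal digits, and use that as the field width. this is only the guess from this message, not GNU's actual code; guess_width and format_count are hypothetical names, and a real tool would fill the sizes array from st_size after calling stat(2) on each file.

```c
#include <stdio.h>

/* Number of decimal digits needed to print n (assumes n >= 0). */
int digits(long long n)
{
  int d = 1;

  while (n >= 10) {
    n /= 10;
    d++;
  }
  return d;
}

/* Guessed heuristic: since a file cannot contain more newlines than
 * bytes, the digit count of the largest claimed size bounds the width
 * of every line count. (hypothetical helper, not GNU's implementation) */
int guess_width(const long long *sizes, int count)
{
  long long max = 0;
  int i;

  for (i = 0; i < count; i++)
    if (sizes[i] > max) max = sizes[i];
  return digits(max);
}

/* Print one count right-aligned in the chosen width. */
int format_count(char *buf, size_t len, int width, long count,
                 const char *name)
{
  return snprintf(buf, len, "%*ld %s\n", width, count, name);
}
```

if every file claims size 0, as the /proc files appear to, guess_width returns 1 and nothing gets padded even when the real counts turn out to be four digits long, which would match the uncolumnated /proc output above.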
>> so, anyway... it looks like you've implemented the documented GNU
>> extension, but in practice they don't actually do what they claim to
>> do. it seems like the true GNU extension is actually "there are no
>> leading spaces if only one count is printed *and* there's only one
>> file".
>>
>> ah, i think we've just misinterpreted what "only one count" means in
>> the GNU doc: they mean one *file*, not one *column*. that certainly
>> seems to match the actual behavior.
>>
>> fix attached.


