[Toybox] [PATCH] ls new option : b
Rich Felker
dalias at libc.org
Fri Mar 11 17:37:54 PST 2016
On Fri, Mar 11, 2016 at 12:57:05AM -0600, Rob Landley wrote:
> On 03/09/2016 10:40 PM, Sameer Pradhan wrote:
> > Thanks for your comment Isaac.
> >
> > I have modified the patch to address your comments.
> > Please find the modified patch as attachment.
>
> I'm already poking at ls to fix the POSIX date format thing mentioned
> earlier, but assuming this patch supersedes your first attempt, let's
> take a look...
>
> You're not modifying strwidth, and ls with no arguments acts like ls -C
> when we have a tty, so "ls -b" is going to get column sizes wrong.
>
> The heart of this patch is just 3 lines. I have more comments than lines.
>
> + for (b = sort[next]->name; *b; b++)
> + if (isgraph(*b)) fputc(*b, stdout);
> + else printf("\\%3hho", *b);
>
> 1) You need to say %03 or you'll have "\ 52" with a space in it.
>
> 2) You're not escaping \ so you can't distinguish between a file called "\033"
> and a file containing an escape character.
>
> 3) What does the "hh" accomplish exactly?
>
> In C99, varargs promotes anything shorter than int to int, so that before
> the rise of 64 bit systems all your arguments were the same size on the stack.
> (On 64 bit systems it _didn't_ expand that to "long" because it didn't want
> to waste stack space, and thus the horrible need to typecast (void *)0 in
> varargs and keep int/long passing straight with %d vs %ld vs %lld...)
>
> Anyway, my point is %o should work fine, is there a reason for the hh here?

This is true in C89 too. The hh prefix is not needed here and is
probably a complete no-op, though there may be dissenting views on what
happens if you use the hh prefix with a value that doesn't fit in
unsigned char.
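
To make points 1 to 3 concrete, here is a minimal sketch of the escaping
loop with a zero-padded %03o, backslash escaping, and no hh prefix. The
function name escape_name and its buffer-based interface are hypothetical
(they are not from the patch or from toybox). The cast to unsigned char
keeps the isgraph() argument in its valid range and keeps high bytes from
promoting to negative ints, which sidesteps the hh question entirely.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not toybox code): write an escaped copy of name
 * into out, using at most outlen bytes including the trailing NUL.
 * Returns the length the full escaped name needs, which is also the
 * width a column layout would have to budget for it. */
size_t escape_name(const char *name, char *out, size_t outlen)
{
  const unsigned char *b;
  size_t pos = 0, need = 0;

  assert(out && outlen);
  for (b = (const unsigned char *)name; *b; b++) {
    char tmp[5];  /* longest piece is "\ooo" plus NUL */
    size_t len;

    /* Escape the escape character itself so output stays unambiguous. */
    if (*b == '\\') len = (size_t)sprintf(tmp, "\\\\");
    else if (isgraph(*b)) len = (size_t)sprintf(tmp, "%c", *b);
    /* %03o zero-pads to three digits; no hh needed after the cast. */
    else len = (size_t)sprintf(tmp, "\\%03o", (unsigned)*b);

    /* Copy only while nothing has been truncated and the piece fits. */
    if (need == pos && pos + len < outlen) {
      memcpy(out + pos, tmp, len);
      pos += len;
    }
    need += len;
  }
  out[pos] = 0;

  return need;
}
```

With this sketch a name consisting of a single escape byte comes out as
\033, while a name made of the four literal characters \, 0, 3, 3 comes
out as \\033, so the two can be told apart. Like the patch, it uses
isgraph, so a space is escaped as \040.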
> 4) No UTF-8 support? I tested ls -b on a directory with Japanese and Arabic
> text and the Ubuntu one didn't escape those.
>
> You haven't really defined what "unprintable" means with regard to UTF-8.
> Are combining characters printable? (They're zero length, but they do
> stuff.) How about the direction-switching sequences?

I think it should be determined by iswprint on the decoded characters.
However, given that the intent of -b is to make it easy to identify the
contents of filenames you may not be able to read or that may be
ambiguous, some people might argue that -b should show everything but
ASCII as escapes. I don't think this is necessary since, if you want
that behavior, you can do LC_CTYPE=C ls -b ...
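
A rough sketch of the iswprint approach, under some assumptions: the
function name escape_name_mb and its interface are hypothetical, the
caller is assumed to have already called setlocale(LC_CTYPE, "") so that
mbrtowc decodes according to the user's locale, and bytes that fail to
decode are simply escaped one at a time. It does not try to answer the
combining-character or direction-switching questions.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper (not toybox code): like a byte-wise escaper, but
 * decodes multibyte characters with mbrtowc() and keeps any character
 * that iswprint() accepts. Returns the length the full result needs. */
size_t escape_name_mb(const char *name, char *out, size_t outlen)
{
  size_t left = strlen(name), pos = 0, need = 0;
  mbstate_t st;

  assert(out && outlen);
  memset(&st, 0, sizeof(st));
  while (left) {
    char tmp[4 * MB_LEN_MAX + 1];  /* worst case: every byte as \ooo */
    size_t n, i, len = 0;
    wchar_t wc;
    int keep;

    n = mbrtowc(&wc, name, left, &st);
    if (n == 0 || n == (size_t)-1 || n == (size_t)-2) {
      /* Invalid or truncated sequence: escape one byte and resync. */
      memset(&st, 0, sizeof(st));
      n = 1;
      keep = 0;
    } else keep = iswprint((wint_t)wc) && wc != L'\\';

    if (keep) {
      /* Printable in the current locale: copy its bytes unchanged. */
      memcpy(tmp, name, n);
      len = n;
    } else if (n == 1 && *name == '\\') len = (size_t)sprintf(tmp, "\\\\");
    else for (i = 0; i < n; i++)
      len += (size_t)sprintf(tmp + len, "\\%03o",
                             (unsigned)(unsigned char)name[i]);

    /* Copy only while nothing has been truncated and the piece fits. */
    if (need == pos && pos + len < outlen) {
      memcpy(out + pos, tmp, len);
      pos += len;
    }
    need += len;
    name += n;
    left -= n;
  }
  out[pos] = 0;

  return need;
}
```

Note that iswprint accepts a space, so unlike an isgraph-based loop this
leaves spaces unescaped. The return value counts bytes, not display
columns, so column sizing would still need a width calculation on top.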
Rich