[Toybox] [PATCH] ls new option : b

Rich Felker dalias at libc.org
Fri Mar 11 17:37:54 PST 2016


On Fri, Mar 11, 2016 at 12:57:05AM -0600, Rob Landley wrote:
> On 03/09/2016 10:40 PM, Sameer Pradhan wrote:
> > Thanks for your comment Isaac.
> > 
> > I have modified the patch to address your comments.
> > Please find the modified patch attached.
> 
> I'm already poking at ls to fix the date format posix thing mentioned
> earlier, but assuming this patch supersedes your first attempt, let's
> take a look...
> 
> You're not modifying strwidth, and ls with no arguments acts like ls -C
> when we have a tty, so "ls -b" is going to get column sizes wrong.
> 
> The heart of this patch is just 3 lines. I have more comments than lines.
> 
> +        for (b = sort[next]->name; *b; b++)
> +            if (isgraph(*b)) fputc(*b, stdout);
> +         else printf("\\%3hho", *b);
> 
> 1) You need to say %03 or you'll have "\ 52" with a space in it.
> 
> 2) You're not escaping \ so you can't distinguish between a file called "\033"
> and a file containing an escape character.
> 
> 3) What does the "hh" accomplish exactly?
> 
> In C99, varargs promotes anything shorter than int to int, so that before
> the rise of 64 bit systems all your arguments were the same size on the stack.
> (On 64 bit systems it _didn't_ expand that to "long" because it didn't want
> to waste stack space, and thus the horrible need to typecast (void *)0 in
> varargs and keep int/long passing straight with %d vs %ld vs %lld...)
> 
> Anyway, my point is %o should work fine, is there a reason for the hh here?

This is true in C89 too, since the default argument promotions already
apply there. The hh prefix is not needed and is probably a complete
no-op, though there may be dissenting views on what happens if you use
the hh prefix with a value that doesn't fit in unsigned char.
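
To make points 1-3 concrete, here is a minimal sketch, not toybox code.
The helper name escape_name and its buffer-based interface are made up
for illustration. It zero-pads the octal escape, escapes the backslash
itself, and passes the byte to a plain %o after converting it through
unsigned char, so no hh is involved.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (an assumption, not part of the patch): write an
 * escaped copy of name into out (capacity cap, always NUL-terminated)
 * and return the number of characters written. */
static size_t escape_name(const char *name, char *out, size_t cap)
{
    size_t len = 0;

    assert(out && cap > 0);
    for (; *name; name++) {
        /* Go through unsigned char: plain char may be signed, and a
         * negative value would be promoted to a negative int. */
        unsigned char c = (unsigned char)*name;
        char piece[5];  /* backslash + 3 octal digits + NUL */

        if (c == '\\') strcpy(piece, "\\\\");
        else if (isgraph(c)) { piece[0] = (char)c; piece[1] = 0; }
        else snprintf(piece, sizeof(piece), "\\%03o", (unsigned)c);

        if (len + strlen(piece) >= cap) break;  /* truncate, never overflow */
        strcpy(out + len, piece);
        len += strlen(piece);
    }
    out[len] = 0;
    return len;
}
```

With this, a file whose name is the four characters \033 and a file
whose name is a single escape byte produce different output, and a tab
comes out as \011 rather than "\ 11".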

> 4) No UTF-8 support? I tested ls -b on a directory with Japanese and Arabic
> text and the Ubuntu one didn't escape those.
> 
> You haven't really defined what "unprintable" means with regard to UTF-8.
> Are combining characters printable? (They're zero length, but they do
> stuff.) How about the direction-switching sequences?

I think it should be determined by iswprint on the decoded characters.
However, because the intent of -b is to make it easy to identify the
contents of filenames you may not be able to read or that may be
ambiguous, some people might argue that -b should show everything but
ASCII as escapes. I don't think this is necessary since, if you want
that behavior, you can do LC_CTYPE=C ls -b ...
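
A rough sketch of the iswprint approach, assuming a hypothetical helper
escape_name_mb (the name and interface are mine, not toybox's). It
decodes with mbrtowc in whatever locale the caller has set, copies
characters that iswprint accepts, and emits every byte of anything else
as a zero-padded octal escape. Bytes that fail to decode are escaped one
at a time. Note that iswprint accepts a plain space, unlike the isgraph
test in the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper (an assumption): escape name into out (capacity
 * cap, always NUL-terminated) and return the number of bytes written.
 * The caller is expected to have called setlocale(LC_CTYPE, "") if it
 * wants the user's encoding; otherwise the C locale is in effect. */
static size_t escape_name_mb(const char *name, char *out, size_t cap)
{
    size_t len = 0, left = strlen(name);
    mbstate_t st;

    assert(out && cap > 0);
    memset(&st, 0, sizeof(st));
    while (left) {
        wchar_t wc = 0;
        size_t i, need, n = mbrtowc(&wc, name, left, &st);
        int valid = n != 0 && n != (size_t)-1 && n != (size_t)-2;
        int slash, show;

        if (!valid) {  /* bad or truncated sequence: take one byte, resync */
            n = 1;
            memset(&st, 0, sizeof(st));
        }
        slash = valid && wc == L'\\';
        show = valid && !slash && iswprint((wint_t)wc);
        need = show ? n : slash ? 2 : 4 * n;
        if (len + need >= cap) break;  /* truncate, never overflow */

        if (show) memcpy(out + len, name, n);
        else if (slash) memcpy(out + len, "\\\\", 2);
        else for (i = 0; i < n; i++)
            snprintf(out + len + 4 * i, 5, "\\%03o",
                     (unsigned)(unsigned char)name[i]);
        len += need;
        name += n;
        left -= n;
    }
    out[len] = 0;
    return len;
}
```

If the program never calls setlocale it stays in the C locale, where I
would expect non-ASCII bytes to come out as escapes, which is the same
effect as running with LC_CTYPE=C.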

Rich
