[Toybox] Looking at nl
Rob Landley
rob at landley.net
Sun May 26 18:34:36 PDT 2013
On 05/25/2013 06:36:49 PM, idunham at lavabit.com wrote:
> I thought I'd look at what nl takes, since it's not much more than
> looping over some lines, incrementing, and formatting the output.
> The POSIX reference page is:
> http://pubs.opengroup.org/onlinepubs/9699919799/utilities/nl.html
I suspect that page's author did not speak English as their first
language.
I also suspect nobody's looked at this command in 15 years. Here's the
Open Group Base Specifications Issue 5 version (SUSv4 is Issue 7, SUSv3
was Issue 6; this one is SUSv2, from 1997):
http://pubs.opengroup.org/onlinepubs/7908799/xcu/nl.html
> There are a few little "fun" details, though.
There usually are. :)
> -Fixed length strings.
> -n should be one of 3 2-char strings, and -d should be at most 2
> chars.
> Is there a way to indicate this in NEWTOY or GLOBALS, or should we just
> check for matches in the main loop?
Just do it in main. There's nothing in lib/args.c to do that now, and I
don't think it's generic enough to add. (Also, -n is checking for 3
specific strings, the error is just the 'else' case even if it's length
2.)
Also, -d "" would be an error case, so "<2" isn't useful there either.
That one is: length 1 has a behavior, length 2 has a behavior, and
anything else is an error.
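A rough sketch of what "do it in main" could look like for those two
options. The helper names and return conventions here are made up for
illustration (they are not existing toybox functions), and the length 1
behavior for -d follows the POSIX text, where a single character replaces
only the first delimiter character:

```c
#include <string.h>

// Hypothetical helper: identify the -n argument.
// Returns 0 for "ln", 1 for "rn", 2 for "rz", and -1 for anything else.
// The error is just the fall-through case, even when the length is 2.
static int nl_number_style(const char *arg)
{
  static const char *styles[] = {"ln", "rn", "rz"};
  int i;

  for (i = 0; i < 3; i++) if (!strcmp(arg, styles[i])) return i;

  return -1;
}

// Hypothetical helper: apply the -d argument to a 2 character delimiter.
// Length 1 replaces only the first character, length 2 replaces both,
// and any other length (including "") is an error.
// Returns 0 on success, -1 on error.
static int nl_set_delim(const char *arg, char delim[2])
{
  size_t len = strlen(arg);

  if (len != 1 && len != 2) return -1;
  delim[0] = arg[0];
  if (len == 2) delim[1] = arg[1];

  return 0;
}
```

The caller would start with delim set to the default pair (backslash,
colon) and report an error whenever either helper returns -1.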
> I'm half tempted to just ignore length and assume the default if an invalid
> arg is specified. So -n lna would be treated as -n ln, and -n asd would be
> treated as -n rn. But that's probably a little too liberal in
> accepting bad flags...
Up to you. I suspect that most of the nl complexity is vestigial.
Haven't encountered anything that uses more than just "number lines",
but then I wasn't looking...
> -Variable format specifiers:
> -w 5 means printing roughly this:
> printf("%5d%s%s", linecount, sep, toybuf)
> But -n ln -w 7 makes it %-7d...
So all these options are to control the alignment and indentation of
the line numbers.
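Since printf() takes the field width as an argument via '*', the three -n
styles don't need a format string built at runtime. A minimal sketch,
assuming a made-up helper name and that the line text is already in hand:

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: format one numbered line into buf.
// fmt is the -n argument ("ln", "rn" or "rz"), width is the -w argument.
// Anything other than "ln" or "rz" is treated as the default, "rn".
// Returns what snprintf() returns.
static int nl_format_line(char *buf, size_t size, const char *fmt, int width,
                          long num, const char *sep, const char *line)
{
  // Left justified, no leading zeros.
  if (!strcmp(fmt, "ln"))
    return snprintf(buf, size, "%-*ld%s%s", width, num, sep, line);

  // Right justified, padded with leading zeros.
  if (!strcmp(fmt, "rz"))
    return snprintf(buf, size, "%0*ld%s%s", width, num, sep, line);

  // Right justified, padded with spaces.
  return snprintf(buf, size, "%*ld%s%s", width, num, sep, line);
}
```

So -n ln -w 7 ends up as the "%-7d" case from the quoted text, with the 7
passed in as data rather than baked into the string.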
> -When to start a new page.
> A page contains a header (delimiter occurs 3x),
> body (delimiter occurs 2x), and footer (delimiter occurs once).
>
> POSIX specifies that line numbering shall be reset at the start of each
> logical page, and that "Unless otherwise specified, nl shall assume the
> text being read is in a single logical page body."
> The obvious approach to me is to say that if you go to a new page section
> no lower on the page, you've started a new page.
> So
> \:\:
> Text of Page 1
> \:\:
> More text.
Makes sense to me.
> is 2 page body sections, numbered thus:
> 1 Text of Page 1
>
> 1 More text.
>
> However, GNU nl assumes that a new page starts with a new header, and
> treats two consecutive body sections as one...except it prints a blank
> line between them as a section separation should. So the sample above
> becomes:
> 1 Text of Page 1
>
> 2 More text.
I've hit a number of things gnu gets wrong. :)
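The "no lower on the page" rule from the quoted proposal fits in a couple
of small functions. This is only a sketch of that proposal (not of what GNU
does), and the names are invented for the example:

```c
#include <string.h>

// Section kinds, valued by how many times the delimiter repeats on the line.
// A higher value means higher up on the page.
enum nl_section { NL_TEXT = 0, NL_FOOTER = 1, NL_BODY = 2, NL_HEADER = 3 };

// Classify a line (trailing newline already stripped). A line consisting of
// nothing but 1, 2 or 3 copies of the two character delimiter marks a
// footer, body or header; everything else is ordinary text.
static enum nl_section nl_classify(const char *line, const char delim[2])
{
  int count = 0;

  while (line[0] && line[0] == delim[0] && line[1] == delim[1]) {
    line += 2;
    count++;
  }
  if (*line || count < 1 || count > 3) return NL_TEXT;

  return (enum nl_section)count;
}

// Entering a section that is no lower on the page than the current one
// starts a new logical page, which is when numbering gets reset.
static int nl_starts_new_page(enum nl_section current, enum nl_section next)
{
  return next >= current;
}
```

The reader would start with the current section set to NL_BODY, matching
the "single logical page body" default, so the body, body example above
resets the count at the second delimiter line.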
> On other topics...
> Would initializing GLOBALS to the defaults be something sane to allow,
> or would it complicate the build system too much?
> Example:
> USE_NL(NEWTOY(nl, "1b:d:f:h:i#l#n:ps:v#w#", TOYBOX_USR | TOYBOX_BIN))
> ..
> GLOBALS(
> char *btype = "t";
> char *delim = "\\:";
> char *ftype = "n";
> char *htype = "n";
> long incr = 1;
> long maxblank = 1;
> char *fmt = "rn";
> char *sep = "\t";
> long startnum = 1;
> long width = 6;
> )
GLOBALS is actually a union, which starts zeroed. Initializing it to
command-specific values would mean it wasn't zeroed for other commands.
I.E. do it in main.
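Something along these lines, as a standalone sketch. The struct is a
stand-in for the command's GLOBALS block using the field names from the
quoted example, and the "seen" bitmask is a made-up substitute for checking
which option flags were actually given:

```c
#include <stddef.h>

// Stand-in for the command's GLOBALS block (field names from the example).
struct nl_data {
  const char *btype, *delim, *ftype, *htype, *fmt, *sep;
  long incr, maxblank, startnum, width;
};

// Hypothetical "this option was on the command line" bits.
enum { SEEN_I = 1, SEEN_L = 2, SEEN_V = 4, SEEN_W = 8 };

// Fill in defaults for whatever option parsing left zeroed.
static void nl_defaults(struct nl_data *d, unsigned seen)
{
  // Strings: a NULL pointer can only mean the option was not given.
  if (!d->btype) d->btype = "t";
  if (!d->delim) d->delim = "\\:";
  if (!d->ftype) d->ftype = "n";
  if (!d->htype) d->htype = "n";
  if (!d->fmt) d->fmt = "rn";
  if (!d->sep) d->sep = "\t";

  // Numbers: 0 can be a real value (-v 0), so check the flag instead.
  if (!(seen & SEEN_I)) d->incr = 1;
  if (!(seen & SEEN_L)) d->maxblank = 1;
  if (!(seen & SEEN_V)) d->startnum = 1;
  if (!(seen & SEEN_W)) d->width = 6;
}
```

Called once at the top of the command's main function, this leaves the
zeroed union alone for every other command.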
Rob