[Toybox] [PATCH] Minor formatting and cleanup in getopt.c

Rob Landley rob at landley.net
Thu Feb 29 17:11:40 PST 2024


On 2/28/24 18:23, Oliver Webb via Toybox wrote:
> I've been looking at some of the other pending commands, And found a getopt implementation from 2017 by AOSP.
> Looking at the source code, it doesn't seem unclean, nor overly large (about 100 lines).
> It does use getopt_long_only, a GNU extension of glibc, but musl has that so it will work
> on both glibc and musl. It is GNU compatible too, all test cases pass (even for TEST_HOST).

It's mostly that:

A) I'm not a shell getopt user and haven't put in the focus to come up to speed
on it yet,

B) the shell has a "getopts" (plural) builtin that I need to write, but which
this doesn't share code with. And LOTS of stuff confuses the two. (There's a
comment in this source about "BSD getopts", when this is "getopt".) Of COURSE
the semantics of the two are subtly different.

C) I'd really wanted to get it to use lib/args.c instead of having two getopt()
implementations in toybox (one from lib/ and one pulled in from libc).

Sigh. That said, I've been sitting on this for a long time without doing
anything about it, and the perfect is the enemy of the good.

The libc version isn't _bad_, it just bloats the static binary pulling in two
implementations of approximately the same code. I don't know if switching it
over to lib/args.c is even possible (the semantics are close, but I'd need one
to be a strict subset of the other), but doing so _after_ merging it wouldn't be
a much heavier lift than doing it before. (I guess that's what the regression
test suite is for?)

I should bite the bullet and promote this...

> There were a few things I changed (printf to xputsn/putchar/xprintf), and added a link
> to https://man7.org/linux/man-pages/man1/getopt.1.html, both in the attached patch.

Applied, but I've got some fixups to do on top of this. The help text for one
thing, I dunno how this command works and the help is _not_ explaining it. (What
does "parse" mean here? Details!)

The existence of the -T option is a damning indictment of the entire gnu
project. As if we needed more of those. The -s option isn't doing them any
favors either: PICK ONE. But then again, this is gnu we're talking about:

  $ printf %s one two three
  onetwothree$ echo -- one two three
  -- one two three

Not known for providing good options, are they?

> There doesn't seem like a good reason for this command to be in pending, are the calls to
> getopt_long_only a problem?

My lack of domain expertise in it has been the big blocker, plus only very
intermittent pokes about its absence.

The man page does not have a usage example. I'm guessing there's a:

  $ for i in $(getopt "$@"); do blah; done

or maybe

  $ getopt "$@" | while read i; do blah; done

Hmmm...

  $ for i in $(echo '"one two" three'); do echo $i; done
  "one
  two"
  three

In what context is this "quote reparsing" supposed to occur? How does "using
eval" as the man page suggests happen _safely_ here? See above for two attempts
at making "for i in $()" output collate them back together sanely. Sigh, what
does this sucker actually DO...

$ getopt abc one two three
 -- one two three
$ getopt abc one -a two three
 -a -- one two three
$ getopt abc one -a two -x three
getopt: invalid option -- 'x'
 -a -- one two three
$ getopt a:bc one -a two -x three
getopt: invalid option -- 'x'
 -a two -- one three
$ getopt a:bc one -a two three
 -a two -- one three
$ getopt a:bc one -a two three; echo $?
 -a two -- one three
0
$ getopt a:bc one -a two -x three; echo $?
getopt: invalid option -- 'x'
 -a two -- one three
1
$ getopt a -a
 -a --

<coulson>So that's what it does.</coulson>

Still no idea WHY. (And why does it output a leading space?) Do colons work for
longopts...

$ getopt -l walrus: abc hello -b --walrus x
 -b --walrus 'x' -- 'hello'
$ getopt -l walrus:: abc hello --walrus -b
 --walrus '' -b -- 'hello'

Ok, the help text should PROBABLY MENTION THAT then. No obvious way to tell it
to pass through unknown arguments...

$ getopt -q a ping -x; echo $?
 -- 'ping'
1
$ getopt -Q a -x
getopt: invalid option -- 'x'

And why is 'hello' quoted above but not --walrus? What's the logic...

$ getopt -l 'wal rus':: abc hello '--wal rus' -b
getopt: unrecognized option '--wal rus'
 -b -- 'hello'

Lovely. That's just... bravo. (Ok, it's quoting any string that nominally
originated after the -- argument.)

Sigh. Ok, assume there's a userbase out there that wants... "this". I don't have
to understand WHY. (But I do want to actually DOCUMENT it in the help text,
which... grrr.)

Why is this getopt_main() have a comment saying no -o means don't quote output?

$ getopt -l walrus:: abc hello --walrus -b 'one two'
 --walrus '' -b -- 'hello' 'one two'

Sigh, libc api's with random globals make me sad. (This came up with the time
zone parsing the other day too. You can pass a struct as an argument. But no,
opterr, optind, and optopt.)

What is with the magic ? here?

$ getopt -l long:,arg:: 'ab?c' command --long -b there --arg '-?'
 --long '-b' --arg '' -- 'command' 'there'
$ echo $?
1

Lovely. Nonzero exit with nothing written to stderr.

It can ALMOST use error_msg() except the name isn't in which->this.
Frustratingly close.

WHY is there a comment about BSD getopt? Was this tested on bsd targets before
being tested on Linux targets? Is a BSD workaround a big deal? (Does it affect mac?)

Hang on. why doesn't this match:

    } else if (!ch) {
      xprintf(" --%s", lopts[i].name);
      if (lopts[i].has_arg) out(optarg ? : "");
    } else {
      xprintf(" -%c", ch);
      if (optarg) out(optarg);
    }

It's checking has_arg for the longopts, but NOT checking it for the short options?

$ getopt ab::c -b
 -b  --
$ getopt -l x ab::c --x
 --x --
$ getopt ab::c -bx
 -b x --
$ getopt ab::c -b'x d'
 -b x d --

Ok, optional short arguments need collating (which I implemented in lib/args.c
already), but... why was the output quoted ABOVE but not quoted HERE?

$ getopt -l x:: ab::c --x
 --x '' --
$ getopt -l abc ab::c boo -b'x d'
 -b 'x d' -- 'boo'

Why does feeding -l an argument longer than one character make a difference to
whether the output is quoted?

$ getopt -l one:: abc thingy --one potato
 --one '' -- 'thingy' 'potato'
$ getopt -l one:: abc thingy --one=potato
 --one 'potato' -- 'thingy'

Ok, optional longopts need to be attached too. Still doesn't explain the
difference in quote output...

Rob


More information about the Toybox mailing list