[Toybox] more tar madness

Rob Landley rob at landley.net
Sun Oct 9 03:04:24 PDT 2022


On 10/8/22 03:11, Rob Landley wrote:
>> also, allegedly (by which i only mean "i haven't confirmed myself") that new use
>> of --xform gets you a "bad xform" with toybox tar...
>> 
>> tar --directory={intermediates_dir} --wildcards --xform='s#^.+/##x' -xf
>> {base_modules_archive} '*.ko'

I note that --wildcards sucks even more because --wildcards-match-slash exists.
Which is just fnmatch(FNM_PATHNAME) but I still have to work out what option
goes with what and do test cases...

> Because I haven't added the xform flag support yet, and adding 'x' is part of
> that. (People keep sending me bug reports...)

Hmmm, no that isn't it, you added s///x in commit 50d8ed89b1e0. (Still in my
todo because _I_ hadn't done it, but it's already in.)

Let's see, --directory is -C, -xf filename, and it's gonna fail to find anything
because --wildcards isn't implemented yet, so I'm confused by your example?
'*.ko' isn't likely to be a literal filename so it wouldn't find anything, and
--wildcards will barf as an unrecognized option:

  $ toybox tar --wildcards
  tar: Unknown option 'wildcards' (see "tar --help")

So let's see...

$ tar c /lib/modules/$(uname -r)/kernel/fs | toybox tar tv --xform 's#^.+/##x'
tar: Removing leading `/' from member names
tar: bad xform

Yes, I'm reproducing that failure. What's going on here... Ok, sed is returning
length zero output to tar. (With the prefixed 00000000 and everything.)

In sed, it regcomps '^.+/' with flags 1... which is REG_EXTENDED. Good. Reads
lib/modules/4.19.0-19-amd64/kernel/fs/ which is 38 bytes, calls regexec0() which
fills the start/end info into the first regmatch structure and adds the
REG_STARTEND flag and calls regexec() which returns 0 (success)... but the first
structure still has 0 and 38, so it's treated as a match of the whole thing?

What is the regex actually trying to _do_ here? man 7 regex... extended regex
syntax, where + means "1 or more" the way * means "0 or more".

So s#^.+/##x tells sed to match from the start of string, any character (period
wildcard), one or more, continue until the LAST (greedy!) forward slash. And the
string ended with a forward slash. So yes, that regex should match the entire
string:

  $ echo hello | sed -E 's#^.+/##'
  hello
  $ echo hello/ | sed -E 's#^.+/##'

  $

That's debian host sed, not mine: match all and replace with nothing. It's
returning a zero length string because that's what it was ASKED to do. And
toybox tar is saying "I don't know what to do with a zero length result", so
error_exit().
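
To make the trigger concrete, the same expression can be run against a file
member and a directory member. This is only a minimal sketch, assuming a sed
that accepts -E (GNU and BSD sed both do); strip_dirs and the shortened example
paths are hypothetical, chosen for illustration:

```shell
# Hypothetical helper: apply the s#^.+/## transform to one member name.
strip_dirs() {
  printf '%s\n' "$1" | sed -E 's#^.+/##'
}

# File member: everything through the last slash is removed.
strip_dirs 'lib/modules/kernel/fs/ext4/ext4.ko'   # prints: ext4.ko

# Directory member: the name ends in a slash, so the greedy match
# consumes the whole string and only an empty line comes back.
strip_dirs 'lib/modules/kernel/fs/'               # prints an empty line
```

So only names ending in a slash (directory entries) come out zero length;
regular file names are reduced to their last path component.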

What is SUPPOSED to happen here? Hmmm...

$ diff -u \
    <(tar c /lib/modules/$(uname -r)/kernel/fs 2>/dev/null | tar tv --xform 's#^.+/##x') \
    <(tar c /lib/modules/$(uname -r)/kernel/fs 2>/dev/null | tar tv)

Nothing. Absolutely nothing. The --xform is ignored. When they get a zero length
result, they use the original untransformed string instead of treating it as an
error.
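
The "fall back to the original name" policy described above could be sketched
like this. It illustrates the policy only and is not taken from GNU tar or
toybox source; xform_or_original is a hypothetical name, and the sketch assumes
a sed that accepts -E:

```shell
# Hypothetical helper: transform a member name, but keep the original
# name when the transform leaves nothing behind.
xform_or_original() {
  new=$(printf '%s\n' "$1" | sed -E 's#^.+/##')
  if [ -n "$new" ]; then
    printf '%s\n' "$new"
  else
    printf '%s\n' "$1"
  fi
}

xform_or_original 'kernel/fs/ext4/ext4.ko'   # prints: ext4.ko
xform_or_original 'kernel/fs/'               # prints: kernel/fs/
```

The alternative policy is what toybox tar does today: treat the empty result as
an error and stop.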

Um, valid I suppose? What do you want me to do here?

Rob

