[Toybox] tar --transform again

Rob Landley rob at landley.net
Sat Feb 13 12:33:34 PST 2021


On 2/12/21 7:49 PM, enh via Toybox wrote:
> attached is "just enough" tar --transform for the kernel use case. adding the

Speaking of, the tar manual has --one-top-level[=dir] which seems to be their
version of --restrict, except they do a heuristic on the filename. (How does
that work when you zcat a file into it?)

> missing FLAGS syntax checking is easy, and sed's unescape_delimited_string() is
> easily moved into lib for reuse for fixing the PATTERN part of
> s/PATTERN/REPLACEMENT/FLAGS, but the REPLACEMENT part seems hairy enough that i
> thought i'd stop and ask before ploughing on in this direction...

First printf is debug, second should be gated on -v ?

+    fprintf(stderr, "flags=%s\n", tc->flags);
+        fprintf(stderr, "%s -> %s\n", name, tname);

Style: I'm using the gcc x ?: y syntax these days.

+  strncpy(hdr.name, tname ? tname : hname, sizeof(hdr.name));

Also, if transform() returns name when it makes no changes you can use the
result without testing and then test the free:

  if (lnkname != lnk) free(lnkname);
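
A minimal sketch of that calling convention, under stated assumptions:
demo_transform() and demo_store() are hypothetical names, not the functions in
the patch, and the plain substring replace only stands in for the real regex
work. The point is just that when the unchanged case hands back the caller's
own pointer, the result can be used without a NULL test and the only
comparison left is the one guarding free():

```c
#include <stdlib.h>
#include <string.h>

// Hypothetical stand-in for transform(): returns a freshly malloc()ed string
// when it rewrites the name, or the caller's own pointer when nothing changes.
static char *demo_transform(char *name, const char *from, const char *to)
{
  char *hit = strstr(name, from), *out;
  size_t pre, flen = strlen(from), tlen = strlen(to);

  if (!*from || !hit) return name;
  pre = hit - name;
  out = malloc(strlen(name) - flen + tlen + 1);
  if (!out) return name;   // out of memory: fall back to the untouched name
  memcpy(out, name, pre);
  memcpy(out + pre, to, tlen);
  strcpy(out + pre + tlen, hit + flen);

  return out;
}

// Caller side: use the result unconditionally, free only if it's a new buffer.
// Returns 1 if the name was rewritten, 0 if it was used as-is.
static int demo_store(char *dest, size_t len, char *name)
{
  char *tname = demo_transform(name, "linux-", "kernel-");
  int changed = tname != name;

  strncpy(dest, tname, len);   // no "tname ? tname : name" test needed
  if (changed) free(tname);

  return changed;
}
```

Like the strncpy() in the patch, this leaves dest unterminated when the name
fills the whole field, which is what a tar header field expects.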

Sigh, what were the magic barely-documented flags again? Not in the man page...
not in the --transform option of the html manual... ah, here they are:

https://www.gnu.org/software/tar/manual/tar.html#SEC115

Right after the BROKEN LINK to the sed manual. (The <strike>aristocrats</strike>
FSF!)

So the flags are rRsShH, and the default if nothing is specified is rsh. But
presumably if you specify "s" it disables "rh", and if you have R but add r
after it, either that switches the R back off or throws an error?

And they have a "flags=" expression syntax which is just _gratuitous_...

For H, do they mean all entries _after_ the first, or does that include anything
with a link num > 1? (And... why do they have that at all?)

+    if ((S_ISREG(st->st_mode) && !strchr(tc->flags, 'R')) ||
+        (!link_target || (S_ISLNK(st->st_mode) && !strchr(tc->flags, 'S')))) {
+      // TODO: do multiple matches if the sed 's' g flag was supplied
+      // TODO: handle &
+      if (!regexec(&tc->re, name, 1, m, 0)) {

How does that NOT always trigger when !link_target? Even when tc->flags does
have an R you'll still trigger when !link_target?
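
To make the question concrete, here is the quoted condition with each test
reduced to a bool (the function and parameter names are made up for
illustration). Enumerating every combination with no link target shows none of
them are skipped. This only checks the expression as written, it doesn't guess
at what the intended grouping was:

```c
#include <stdbool.h>

// The condition as written in the patch, with each test reduced to a bool.
static bool patch_condition(bool is_reg, bool is_lnk, bool has_R, bool has_S,
                            bool have_link_target)
{
  return (is_reg && !has_R) || (!have_link_target || (is_lnk && !has_S));
}

// Count the inputs with no link target for which the condition is false.
static int count_skipped_without_link_target(void)
{
  int i, skipped = 0;

  // The low 4 bits of i walk all combinations of is_reg, is_lnk, R and S.
  for (i = 0; i < 16; i++)
    if (!patch_condition(i&1, i&2, i&4, i&8, false)) skipped++;

  return skipped;
}
```

count_skipped_without_link_target() returns 0: the R and S tests only matter
when there is a link target.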

I'm sorry, what?

  3) Convert each file name to lower case:
 	
  $ tar --transform 's/.*/\L&/' -x -f arch.tar

Really? Does that...

  $ echo potato | sed 's/./\L/g'

  $ echo potato | sed 's/.*/\L&/'
  potato

No, it doesn't look like it does. They just made it up HERE.

And -v is --show-transformed-names for this? Plus:

       --anchored
       --ignore-case
       --no-anchored
       --no-ignore-case
       --no-wildcards
       --no-wildcards-match-slash
       --wildcards
       --wildcards-match-slash

Sigh. Ok, the if (TT.transform) hunk doesn't support the crazy flags= syntax (I
think I'm ok with that). You have the TODO about separator escaping:

  $ echo abc/def | sed 's/abc\/def/ghi/'
  ghi
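
For reference, a rough sketch of the splitting that the sed example above
implies. copy_field() and split_subst() are hypothetical helpers, not sed's
unescape_delimited_string(), and this assumes the only escape that needs
handling at this stage is backslash+delimiter, with everything else passed
through untouched for regcomp() or the replacement parser to deal with. It
copies into caller-supplied buffers instead of editing the argument in place:

```c
#include <stddef.h>

// Copy one delimiter-terminated field out of *pstr into out, turning
// backslash+delim into a bare delim and keeping every other escape as-is.
// Advances *pstr past the closing delimiter. Returns the field length, or -1
// if the closing delimiter is missing or out is too small.
static int copy_field(const char **pstr, char delim, char *out, size_t len)
{
  const char *s = *pstr;
  size_t n = 0;

  for (; *s && *s != delim; s++) {
    if (*s == '\\' && s[1]) {
      if (s[1] != delim) {
        if (n + 1 >= len) return -1;
        out[n++] = *s;   // keep the backslash of any other escape
      }
      s++;               // the escaped character is copied below
    }
    if (n + 1 >= len) return -1;
    out[n++] = *s;
  }
  if (*s != delim || n >= len) return -1;
  out[n] = 0;
  *pstr = s + 1;

  return (int)n;
}

// Split "s/PATTERN/REPLACEMENT/FLAGS". Whatever character follows the 's' is
// the delimiter. pat and rep must each have room for len bytes. On success
// *flags points at the FLAGS part inside expr.
static int split_subst(const char *expr, char *pat, char *rep, size_t len,
                       const char **flags)
{
  char delim;

  if (*expr++ != 's' || !(delim = *expr++)) return -1;
  if (copy_field(&expr, delim, pat, len) < 0) return -1;
  if (copy_field(&expr, delim, rep, len) < 0) return -1;
  *flags = expr;

  return 0;
}
```

Fed 's/abc\/def/ghi/', this gives PATTERN "abc/def", REPLACEMENT "ghi" and
empty FLAGS. FLAGS validation and the & and \1 handling in REPLACEMENT are
deliberately left out.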

I try not to modify environment data when I can help it because it corrupts
ps/top output...

sed also has [0-9] as flags to only substitute Nth match.

Why weren't we just feeding it through to actual sed again? Because 'p' gets it
out of sync and the 'w' command is just a security hole waiting to happen. Sigh...

My brain hurts.

Rob

