[Toybox] more tar madness

Rob Landley rob at landley.net
Sun Oct 9 17:44:04 PDT 2022


On 10/9/22 05:04, Rob Landley wrote:
> So s#^.+/##x tells sed to match from the start of string, any character (period
> wildcard), one or more, continue until the LAST (greedy!) forward slash. And the
> string ended with a forward slash. So yes, that regex should match the entire
> string:
> 
>   $ echo hello | sed -E 's#^.+/##'
>   hello
>   $ echo hello/ | sed -E 's#^.+/##'
> 
>   $
> 
> That's debian host sed, not mine: match all and replace with nothing. It's
> returning a zero length string because that's what it was ASKED to do. And
> toybox tar is saying "I don't know what to do with a zero length result", so
> error_exit().
> 
> What is SUPPOSED to happen here? Hmmm...
> 
> $ diff -u <(tar c /lib/modules/$(uname -r)/kernel/fs 2>/dev/null | tar tv
> --xform 's#^.+/##x')  <(tar c /lib/modules/$(uname -r)/kernel/fs 2>/dev/null |
> tar tv)
> 
> Nothing. Absolutely nothing. The --xform is ignored. When they get a zero length
> result, they use the original untransformed string instead of treating it as an
> error.

Oh right, gnu tar doesn't show the transformed names unless you explicitly say
--actually-show-the-xform-command-results-rather-than-ignoring-them

-drwxr-xr-x root/root         0 2022-03-09 12:30 fs/
-drwxr-xr-x root/root         0 2022-03-09 12:30 ubifs/
--rw-r--r-- root/root    826475 2022-03-07 15:13 ubifs.ko
-drwxr-xr-x root/root         0 2022-03-09 12:30 xfs/
--rw-r--r-- root/root   2792843 2022-03-07 15:13 xfs.ko

Alright, what's the special magic here... if I tell it to replace with M the
result is Mfs/ and such. Stripping off the ^ makes no difference. --xform
's#.+/#&Z#x' results in lib/modules/4.19.0-19-amd64/kernel/Zfs/

Looks like when the match would eat the entire string, it redoes the match with
one less length. Does not care what it's replacing with (even when I'm putting
the matched string BACK), it just doesn't allow a match to grab the entire span.
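
A quick baseline for comparison, assuming sed -E is a fair stand-in for the x flag on the tar expression (the variable name here is only for illustration). It shows what the same two expressions do to the directory name when nothing tar-specific is involved:

```shell
# Baseline only: plain sed, no tar. The path is the one from the output above.
name=lib/modules/4.19.0-19-amd64/kernel/fs/

# Greedy .+/ runs through the final slash, so & is the whole name.
printf '%s\n' "$name" | sed -E 's#.+/#&Z#'

# Same match replaced with nothing: the result is an empty line.
printf '%s\n' "$name" | sed -E 's#^.+/##'
```

Plain sed puts the Z after the final slash and empties the name entirely, so the Zfs/ result above is not something the regex produces on its own.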

Huh. Except... What happens if a literal match with no wildcards tries to do so?

$ tar c yank | tar tv --show-transformed-names --xform 's/yank/poing/'
-rw-r--r-- landley/landley 365 2019-05-04 17:41 poing
$ tar c yank | tar tv --show-transformed-names --xform 's/yank//'
tar: Substituting `.' for empty member name
-rw-r--r-- landley/landley 365 2019-05-04 17:41 .

It's got #(%&#(#& magic for this! So why... wha...

What is the rule here? When do we add an implicit !$ (however the regex syntax
of that is supposed to work) and when do we NOT do that?

$ tar c yank | tar tv --show-transformed-names --xform 's/y.nk//'
tar: Substituting `.' for empty member name
-rw-r--r-- landley/landley 365 2019-05-04 17:41 .
landley@driftwood:~$ tar c yank | tar tv --show-transformed-names --xform 's/y.*nk//'
tar: Substituting `.' for empty member name
-rw-r--r-- landley/landley 365 2019-05-04 17:41 .
$ mv yank yaank
$ tar c yaank | tar tv --show-transformed-names --xform 's/y.*nk//'
tar: Substituting `.' for empty member name
-rw-r--r-- landley/landley 365 2019-05-04 17:41 .

Doesn't seem to be triggered by the pattern... is it because it ate a subdirectory?

$ tar c www/notes.html | tar tv --show-transformed-names --xform 's/.*//'
tar: Substituting `.' for empty member name
lrwxrwxrwx landley/landley   0 2021-01-14 02:46 . ->

Seriously?

I need to poke at this to figure out what eldritch abomination the gnu loons have
implemented THIS time...

$ tar c www/notes-2022.html | tar tv --show-transformed-names --xform 's@.*/@@'
-rw-r--r-- landley/landley 952956 2022-10-09 03:30 notes-2022.html

Are they SPECIAL CASING FORWARD SLASH AT THE END? Really? Is that it?

$ tar c www/notes-2022.html | tar tv --show-transformed-names --xform 's@.*l@@'
tar: Substituting `.' for empty member name
-rw-r--r-- landley/landley 952956 2022-10-09 03:30 .
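
One way to test the trailing-slash guess without tar in the loop is a small model in shell. This is only a sketch of the hypothesis, not gnu tar's actual code: xform_model is a made-up helper, sed -E stands in for the x flag, and the rule it encodes (drop one trailing slash, substitute, put the slash back, fall back to . when nothing is left) is an assumption that happens to fit the outputs above.

```shell
# Hypothetical model of the observed behavior, NOT gnu tar's implementation.
# Assumption: one trailing slash is set aside before the substitution
# and restored afterwards.
xform_model() {
  script=$1 name=$2 slash=
  case $name in
    */) name=${name%/} slash=/ ;;
  esac
  out=$(printf '%s\n' "$name" | sed -E "$script")
  # Mirror the fallback to . that tar reports for an empty member name.
  [ -n "$out" ] || out=.
  printf '%s%s\n' "$out" "$slash"
}

xform_model 's#^.+/##'  lib/modules/4.19.0-19-amd64/kernel/fs/   # fs/
xform_model 's#.+/#&Z#' lib/modules/4.19.0-19-amd64/kernel/fs/   # lib/modules/4.19.0-19-amd64/kernel/Zfs/
xform_model 's@.*/@@'   www/notes-2022.html                      # notes-2022.html
xform_model 's@.*l@@'   www/notes-2022.html                      # .
```

All four lines agree with what tar printed, which is consistent with the slash being special cased, but it does not prove that is the whole rule.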

I need a walk and a scream.

Rob

