[Toybox] unshare/nsenter and flags

Thu May 2 19:51:24 PDT 2024

On Thu, May 2, 2024 at 6:17 PM Rob Landley <rob at landley.net> wrote:
>
> On 5/2/24 13:14, enh via Toybox wrote:
> > another googler wanted a host unshare(1) for some testing... i added
> > that, and they complained that although the docs say
> >
> >     -r Become root (map current euid/egid to 0/0, implies -U) (--map-root-user)
> >
> > it seems like -r _doesn't_ actually imply -U in practice (and they
> > seemed to have strace output to prove it).
>
> So... should it?

i think so? i have no idea about any of this, but
https://man7.org/linux/man-pages/man1/unshare.1.html says

       -r, --map-root-user
           Run the program only after the current effective user and
           group IDs have been mapped to the superuser UID and GID in
           the newly created user namespace. This makes it possible to
           conveniently gain capabilities needed to manage various
           aspects of the newly created namespaces (such as configuring
           interfaces in the network namespace or mounting filesystems
           in the mount namespace) even when run unprivileged. As a mere
           convenience feature, it does not support more sophisticated
           use cases, such as mapping multiple ranges of UIDs and GIDs.
           This option implies --setgroups=deny and --user. This option
           is equivalent to --map-user=0 --map-group=0.

which sounds like util-linux agrees with the toybox help text (-r
implies -U) rather than with what the toybox source actually does?

> What did they try to do, and what did they _want_ to happen?

unshare --mount --map-root-user /bin/sh -c "mount --bind $A $B"

they looked at strace for toybox and saw

unshare(CLONE_NEWNS)                    = -1 EPERM (Operation not permitted)

but for the util-linux one they saw

unshare(CLONE_NEWNS|CLONE_NEWUSER)      = 0
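
to make the difference between those two strace lines concrete, here's a
minimal sketch of the flag computation the man page describes, where -r
pulls in the user namespace flag. this is not the toybox source: the
OPT_* bits and unshare_flags() are hypothetical names made up for
illustration, and only CLONE_NEWNS/CLONE_NEWUSER are the real constants
from <sched.h>.

```c
#define _GNU_SOURCE
#include <sched.h>

// Hypothetical option bits for this sketch (not toybox's FLAG_* macros).
enum { OPT_m = 1, OPT_r = 2, OPT_U = 4 };

// Translate option bits into unshare(2) flags, treating -r as implying -U.
int unshare_flags(unsigned opts)
{
  int flags = 0;

  // Assumption taken from the util-linux man page: -r implies --user.
  if (opts & OPT_r) opts |= OPT_U;

  if (opts & OPT_m) flags |= CLONE_NEWNS;
  if (opts & OPT_U) flags |= CLONE_NEWUSER;

  return flags;
}
```

with that mapping, "--mount --map-root-user" yields
CLONE_NEWNS|CLONE_NEWUSER (the util-linux trace), while "--mount" alone
yields just CLONE_NEWNS (what the toybox trace showed).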

> I'd compare with my debian unshare command but my install is a bit out of date.
> (According to https://endoflife.date/devuan I've still got 4 weeks of support.)
>
> Coincidentally, I just got an email yesterday morning from "The Happy Dreamhost
> Upgrade Robot" (yes really) that they're updating landley.net's web container:
>
> > We have great news! As part of our mission to support you with your digital
> > presence, we are always looking to improve your products and provide you with
> > the most advanced and powerful hardware.
> >
> > On Wednesday, May 8th we will be migrating you to a newer shared server. As
> > part of this maintenance, the operating system will be upgraded from Ubuntu
> > Bionic to Ubuntu Jammy Jellyfish 22.04.2.
> >
> > In most cases, no action is required on your part, but we've prepared some
> > documentation that will help you prepare for the upgrade to Ubuntu Jammy:
> > https://help.dreamhost.com/hc/en-us/articles/15506945971220
>
> The "22.04" means it came out two years and one month ago, and that's what
> they're migrating me TO. So, you know, I can presumably feel less bad about my
> laptop...

(to be fair, until _last week_ that was the current LTS release :-)
but, yeah, odd timing unless they deliberately like to be on the
previous LTS release! i'll throw no stones as long as i'm living so
close to the Android build server glass house though...)

> > i was assuming the code was just missing, but when i looked, i found:
> >
> > // unshare -U does not imply -r, so we cannot use [+rU]
> > if (test_r()) toys.optflags |= FLAG_U;
>
> Let's see, git annotate says that comment comes from commit 3c0be8a473c0:
>
> Author: Samuel Holland <samuel at sholland.net>
> Date:   Sun Apr 12 16:00:16 2015 -0500
>
>     unshare: fix -r
>
>     Calling unshare(2) immediately puts us in the new namespace
>     with the "overflow" user and group ID. By calling geteuid()
>     and getegid() in handle_r() after calling unshare(), we try
>     to map that to root, which Linux refuses to let us do.
>
>     What we really want to map to root is the caller's uid/gid
>     in the original namespace. So we have to save them before
>     calling unshare().
>
> Meanwhile the "implies" in the help text comes from commit fb4a241f35cf two
> months earlier:
>
> Author: Rob Landley <rob at landley.net>
> Date:   Wed Feb 18 15:19:15 2015 -0600
>
>     Patch from Isaac Dunham to add -r, fixed up so it doesn't
>     try to include two flag contexts simultaneously.
>
> So it looks like Isaac made -r imply -U and Samuel made it _not_ do so, without
> changing the help text, and I didn't notice because I'd really like to build
> domain expertise here but haven't got it. (Largely because doing container stuff
> tends to require root access, and if I'm requiring root access anyway I tend to
> just chroot, or launch a qemu instance that does NOT require root access on the
> host. It's on the todo list...)
>
> I've used toybox's unshare command a bunch of times, but not the UID remapping
> parts...
>
> > but note the unshare/nsenter sharing there --- is it a problem that i
> > have unshare enabled but not nsenter? is that expected to work?
>
> I'm happy to implement proper semantics here if I know what they _are_. What
> _should_ it do?
>
> I recently blogged (https://landley.net/notes.html#13-04-2024) about attending
> yet another container talk at txlf, but if I really want a "contain" command
> what I should probably do is dig through:
>
>   https://github.com/p8952/bocker
>   https://github.com/Fewbytes/rubber-docker
>   https://blog.lizzie.io/linux-containers-in-500-loc.html
>
> And "come up with something". It would be really nice if there was a simple
> existing syntax I could be compatible with, which is why I was vaguely looking
> at what minijail does, and https://github.com/rkt/rkt and
> https://github.com/opencontainers/runc and https://github.com/containers/crun
> and https://github.com/containerd/containerd and so on.
>
> But that's a fresh can of worms to open after I close a couple of existing ones,
> and to get to 1.0 the LFS build needs "awk" more than container support...
>
> Rob
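
for reference, the ordering that Samuel's commit message describes (save
the caller's ids _before_ unshare(), then map them to 0) can be sketched
roughly like this. it's an illustration under assumptions, not the
toybox implementation: format_root_map(), write_file() and
unshare_as_root() are hypothetical helpers, and the setgroups step is
there because the man page says -r implies --setgroups=deny. the
/proc/self/setgroups file needs a 3.19 or newer kernel, so on older
ones this sketch just fails.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

// Format a one-line id map: "<id inside ns> <id outside ns> <count>".
int format_root_map(char *buf, size_t len, unsigned outside_id)
{
  return snprintf(buf, len, "0 %u 1\n", outside_id);
}

// Write a whole string to an existing file, return 0 on success.
int write_file(const char *path, const char *text)
{
  size_t len = strlen(text);
  int fd = open(path, O_WRONLY);
  ssize_t n;

  if (fd < 0) return -1;
  n = write(fd, text, len);
  close(fd);

  return (n == (ssize_t)len) ? 0 : -1;
}

// Hypothetical helper: unshare a user namespace (plus extra_flags) and
// map the caller's original euid/egid to 0/0 inside it.
int unshare_as_root(int extra_flags)
{
  char map[64];
  // Saved before unshare(): afterwards these would return the
  // "overflow" ids, which the kernel refuses to map to root.
  uid_t euid = geteuid();
  gid_t egid = getegid();

  if (unshare(CLONE_NEWUSER | extra_flags)) return -1;

  // An unprivileged gid_map write requires setgroups to be denied first.
  if (write_file("/proc/self/setgroups", "deny")) return -1;

  format_root_map(map, sizeof(map), (unsigned)euid);
  if (write_file("/proc/self/uid_map", map)) return -1;

  format_root_map(map, sizeof(map), (unsigned)egid);
  if (write_file("/proc/self/gid_map", map)) return -1;

  return 0;
}
```

a caller would do something like unshare_as_root(CLONE_NEWNS) and then
exec the command, which is roughly what the "mount --bind" test case
above needs.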