Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add an option to disable nested user namespaces by setting limit to 1 #488

Merged
merged 2 commits into from
Jan 3, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
60 changes: 54 additions & 6 deletions bubblewrap.c
Original file line number Diff line number Diff line change
Expand Up @@ -73,6 +73,8 @@ static const char *opt_file_label = NULL;
static bool opt_as_pid_1;

const char *opt_chdir_path = NULL;
bool opt_assert_userns_disabled = FALSE;
bool opt_disable_userns = FALSE;
bool opt_unshare_user = FALSE;
bool opt_unshare_user_try = FALSE;
bool opt_unshare_pid = FALSE;
Expand Down Expand Up @@ -311,6 +313,8 @@ usage (int ecode, FILE *out)
" --unshare-cgroup-try Create new cgroup namespace if possible else continue by skipping it\n"
" --userns FD Use this user namespace (cannot combine with --unshare-user)\n"
" --userns2 FD After setup switch to this user namespace, only useful with --userns\n"
" --disable-userns Disable further use of user namespaces inside sandbox\n"
" --assert-userns-disabled Fail unless further use of user namespace inside sandbox is disabled\n"
" --pidns FD Use this pid namespace (as parent namespace if using --unshare-pid)\n"
" --uid UID Custom uid in the sandbox (requires --unshare-user or --userns)\n"
" --gid GID Custom gid in the sandbox (requires --unshare-user or --userns)\n"
Expand Down Expand Up @@ -1777,6 +1781,14 @@ parse_args_recurse (int *argcp,
argv++;
argc--;
}
else if (strcmp (arg, "--disable-userns") == 0)
{
opt_disable_userns = TRUE;
}
else if (strcmp (arg, "--assert-userns-disabled") == 0)
{
opt_assert_userns_disabled = TRUE;
}
else if (strcmp (arg, "--remount-ro") == 0)
{
if (argc < 2)
Expand Down Expand Up @@ -2677,6 +2689,12 @@ main (int argc,
if (opt_userns_fd != -1 && opt_unshare_user_try)
die ("--userns not compatible --unshare-user-try");

if (opt_disable_userns && !opt_unshare_user)
die ("--disable-userns requires --unshare-user");

if (opt_disable_userns && opt_userns_block_fd != -1)
die ("--disable-userns is not compatible with --userns-block-fd");

/* Technically using setns() is probably safe even in the privileged
* case, because we got passed in a file descriptor to the
* namespace, and that can only be gotten if you have ptrace
Expand Down Expand Up @@ -3155,13 +3173,34 @@ main (int argc,
if (opt_userns2_fd > 0 && setns (opt_userns2_fd, CLONE_NEWUSER) != 0)
die_with_error ("Setting userns2 failed");

if (opt_unshare_user &&
(ns_uid != opt_sandbox_uid || ns_gid != opt_sandbox_gid) &&
opt_userns_block_fd == -1)
if (opt_unshare_user && opt_userns_block_fd == -1 &&
(ns_uid != opt_sandbox_uid || ns_gid != opt_sandbox_gid ||
opt_disable_userns))
Comment on lines +3176 to +3178
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that as written here, we can get into this block even if we're setuid root. Please check that this isn't a security vulnerability?

I think it's OK, because we could already have entered this block anyway (for --unshare-user --sandbox-uid=42) even with a setuid bwrap, and the only thing we're doing here is reducing the max_user_namespaces in a new namespace, never increasing it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how this is ever a problem. We only ever hit this if --unshare-user is set, which means at this point we did CLONE_USERNS. So, even if we had privileged capabilities on the host we now don't have them in the user namespace:

On the other hand, that process has no capabilities in the parent (in the case of clone(2)) or previous (in the case of [unshare(2)] (https://man7.org/linux/man-pages/man2/unshare.2.html) and setns(2)) user namespace, even if the new namespace is created or joined by the root user (i.e., a process with user ID 0 in the root namespace).

I.e. even in the setuid case we have at this point the same privileges as the non-setuid case.

{
/* Now that devpts is mounted and we've no need for mount
permissions we can create a new userspace and map our uid
1:1 */
/* Here we create a second level userns inside the first one. This is
used for one or more of these reasons:

* The 1st level namespace has a different uid/gid than the
requested due to requirements of beeing root in the first
level due for mounting devpts (opt_needs_devpts).

* To disable user namespaces we set max_user_namespaces and then
create the second namespace so that the sandbox cannot undo this
change.
*/

if (opt_disable_userns)
{
cleanup_fd int sysctl_fd = -1;

sysctl_fd = openat (proc_fd, "sys/user/max_user_namespaces", O_WRONLY);

if (sysctl_fd < 0)
die_with_error ("cannot open /proc/sys/user/max_user_namespaces");

if (write_to_fd (sysctl_fd, "1", 1) < 0)
die_with_error ("sysctl user.max_user_namespaces = 1");
Comment on lines +3201 to +3202
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that this has to be exactly 1. If it was 0, we wouldn't be able to unshare (CLONE_NEWUSER) immediately below, and if it was 2, then this protection would be ineffective.

}

if (unshare (CLONE_NEWUSER))
die_with_error ("unshare user ns");
Expand All @@ -3174,6 +3213,15 @@ main (int argc,
-1, FALSE, FALSE);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alex's version had this write_uid_gid_map() call conditional on ns_uid != opt_sandbox_uid || ns_gid != opt_sandbox_gid, but I'm not 100% sure why, and I found that keeping the conditional broke my sandboxed process's ability to connect to D-Bus - I think because my uid wasn't mapped into the container correctly. It's less diff like this, anyway.

Copy link
Collaborator

@alexlarsson alexlarsson Jan 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I did that because previously in that case we would not hit this codepath so I would avoid doing what we didn't do before. But we just set up a new user namespace, so we have to do this here. So, your change is correct.

}

if (opt_disable_userns || opt_assert_userns_disabled)
{
/* Verify that we can't make a new userns again */
res = unshare (CLONE_NEWUSER);

if (res == 0)
die ("creation of new user namespaces was not disabled as requested");
}

/* All privileged ops are done now, so drop caps we don't need */
drop_privs (!is_privileged, TRUE);

Expand Down
25 changes: 25 additions & 0 deletions bwrap.xml
Original file line number Diff line number Diff line change
Expand Up @@ -144,6 +144,31 @@
<listitem><para>After setting up the new namespace, switch into the specified namespace. For this to work the specified namespace must be a descendant of the user namespace used for the setup, so this is only useful in combination with --userns.</para>
<para>This is useful because sometimes bubblewrap itself creates nested user namespaces (to work around some kernel issues) and --userns2 can be used to enter these.</para></listitem>
</varlistentry>
<varlistentry>
<term><option>--disable-userns</option></term>
<listitem><para>
Prevent the process in the sandbox from creating further user namespaces,
so that it cannot rearrange the filesystem namespace or do other more
complex namespace modification.
This is currently implemented by setting the
<literal>user.max_user_namespaces</literal> sysctl to 1, and then
entering a nested user namespace which is unable to raise that limit
in the outer namespace.
This option requires <option>--unshare-user</option>, and doesn't work
in the setuid version of bubblewrap.
</para></listitem>
</varlistentry>
<varlistentry>
<term><option>--assert-userns-disabled</option></term>
<listitem><para>
Confirm that the process in the sandbox has been prevented from
creating further user namespaces, but without taking any particular
action to prevent that. For example, this can be combined with
<option>--userns</option> to check that the given user namespace
has already been set up to prevent the creation of further user
namespaces.
</para></listitem>
</varlistentry>
<varlistentry>
<term><option>--pidns <arg choice="plain">FD</arg></option></term>
<listitem><para>Use an existing pid namespace instead of creating one. This is often used with --userns, because the pid namespace must be owned by the same user namespace that bwrap uses. </para>
Expand Down
2 changes: 2 additions & 0 deletions completions/bash/bwrap
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,9 @@ _bwrap() {
# Please keep sorted in LC_ALL=C order
local boolean_options="
--as-pid-1
--assert-userns-disabled
--clearenv
--disable-userns
--help
--new-session
--unshare-all
Expand Down
2 changes: 2 additions & 0 deletions completions/zsh/_bwrap
Original file line number Diff line number Diff line change
Expand Up @@ -27,6 +27,7 @@ _bwrap_args=(

# Please sort alphabetically (in LC_ALL=C order) by option name
'--add-seccomp-fd[Load and use seccomp rules from FD]: :_guard "[0-9]#" "file descriptor to read seccomp rules from"'
'--assert-userns-disabled[Fail unless further use of user namespace inside sandbox is disabled]'
'--args[Parse NUL-separated args from FD]: :_guard "[0-9]#" "file descriptor with NUL-separated arguments"'
'--as-pid-1[Do not install a reaper process with PID=1]'
'--bind-try[Equal to --bind but ignores non-existent SRC]:source:_files:destination:_files'
Expand All @@ -41,6 +42,7 @@ _bwrap_args=(
'--dev-bind[Bind mount the host path SRC on DEST, allowing device access]:source:_files:destination:_files'
'--dev[Mount new dev on DEST]:mount point for /dev:_files -/'
"--die-with-parent[Kills with SIGKILL child process (COMMAND) when bwrap or bwrap's parent dies.]"
'--disable-userns[Disable further use of user namespaces inside sandbox]'
'--exec-label[Exec label for the sandbox]:SELinux label:_selinux_contexts'
'--file-label[File label for temporary sandbox content]:SELinux label:_selinux_contexts'
'--gid[Custom gid in the sandbox (requires --unshare-user or --userns)]: :_guard "[0-9]#" "numeric group ID"'
Expand Down
12 changes: 11 additions & 1 deletion tests/test-run.sh
Original file line number Diff line number Diff line change
Expand Up @@ -8,7 +8,7 @@ srcd=$(cd $(dirname "$0") && pwd)

bn=$(basename "$0")

echo "1..57"
echo "1..58"

# Test help
${BWRAP} --help > help.txt
Expand Down Expand Up @@ -112,6 +112,7 @@ echo "ok exec failure doesn't include exit-code in json-status"
if test -n "${bwrap_is_suid:-}"; then
echo "ok - # SKIP no --cap-add support"
echo "ok - # SKIP no --cap-add support"
echo "ok - # SKIP no --disable-userns"
else
BWRAP_RECURSE="$BWRAP --unshare-user --uid 0 --gid 0 --cap-add ALL --bind / / --bind /proc /proc"

Expand All @@ -123,6 +124,15 @@ else
$BWRAP_RECURSE -- /proc/self/exe --unshare-all ${BWRAP_RO_HOST_ARGS} findmnt > recursive-newroot.txt
assert_file_has_content recursive-newroot.txt "/usr"
echo "ok - can pivot to new rootfs recursively"

$BWRAP --dev-bind / / -- true
! $BWRAP --assert-userns-disabled --dev-bind / / -- true
$BWRAP --unshare-user --disable-userns --dev-bind / / -- true
! $BWRAP --unshare-user --disable-userns --dev-bind / / -- $BWRAP --dev-bind / / -- true
$BWRAP --unshare-user --disable-userns --dev-bind / / -- sh -c "echo 2 > /proc/sys/user/max_user_namespaces || true; ! $BWRAP --dev-bind / / -- true"
$BWRAP --unshare-user --disable-userns --dev-bind / / -- sh -c "echo 100 > /proc/sys/user/max_user_namespaces || true; ! $BWRAP --dev-bind / / -- true"
$BWRAP --unshare-user --disable-userns --dev-bind / / -- sh -c "! $BWRAP --dev-bind / / --assert-userns-disabled -- true"
echo "ok - can disable nested userns"
fi

# Test error prefixing
Expand Down