Message-ID: <5871495633F38949900D2BF2DC04883E5632BD@G08CNEXMBPEKD02.g08.fujitsu.local>
Date: Mon, 14 Jul 2014 09:32:39 +0000
From: "chenhanxiao@...fujitsu.com" <chenhanxiao@...fujitsu.com>
To: "Eric W. Biederman" <ebiederm@...ssion.com>,
"Serge E. Hallyn" <serge@...lyn.com>,
"'Daniel P. Berrange (berrange@...hat.com)'" <berrange@...hat.com>
CC: Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
"containers@...ts.linux-foundation.org"
<containers@...ts.linux-foundation.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: RE: Could not mount sysfs when enable userns but disable netns
> -----Original Message-----
> From: Eric W. Biederman [mailto:ebiederm@...ssion.com]
> Sent: Saturday, July 12, 2014 12:29 AM
> To: Serge E. Hallyn
> Cc: Chen, Hanxiao/陈 晗霄; Serge Hallyn (serge.hallyn@...ntu.com); Greg
> Kroah-Hartman; containers@...ts.linux-foundation.org;
> linux-kernel@...r.kernel.org
> Subject: Re: Could not mount sysfs when enable userns but disable netns
>
> "Serge E. Hallyn" <serge@...lyn.com> writes:
>
> > Quoting chenhanxiao@...fujitsu.com (chenhanxiao@...fujitsu.com):
> >> Hello,
> >>
> >> How to reproduce:
> >> 1. Prepare a container, enable userns and disable netns
> >> 2. use libvirt-lxc to start a container
> >> 3. libvirt could not mount sysfs then failed to start.
> >>
> >> Then I found that
> >> commit 7dc5dbc879bd0779924b5132a48b731a0bc04a1e says:
> >> "Don't allow mounting sysfs unless the caller has CAP_SYS_ADMIN rights
> >> over the net namespace."
> >>
> >> But why should we check sysfs mount permission over the net namespace?
> >> We've already checked CAP_SYS_ADMIN, though.
>
> We already checked capable(CAP_SYS_ADMIN) and it failed.
But on my machine, the capable(CAP_SYS_ADMIN) check did not reject the
mount; it failed in kobj_ns_current_may_mount() instead.
I added some printks to sysfs_mount():
     if (!(flags & MS_KERNMOUNT)) {
-        if (!capable(CAP_SYS_ADMIN) && !fs_fully_visible(fs_type))
+        if (!capable(CAP_SYS_ADMIN) && !fs_fully_visible(fs_type)) {
+            printk(KERN_WARNING "Failed in capable\n");
             return ERR_PTR(-EPERM);
+        }
-        if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET))
+        if (!kobj_ns_current_may_mount(KOBJ_NS_TYPE_NET)) {
+            printk(KERN_WARNING "Failed in kobj_ns_current_may_mount\n");
             return ERR_PTR(-EPERM);
+        }
The log then shows:
Jul 14 09:55:26 localhost systemd: Starting Container lxc-chx.
Jul 14 09:55:26 localhost systemd-machined: New machine lxc-chx.
Jul 14 09:55:26 localhost systemd: Started Container lxc-chx.
Jul 14 09:55:26 localhost kernel: [ 784.044709] Failed in kobj_ns_current_may_mount
Jul 14 09:55:26 localhost systemd-machined: Machine lxc-chx terminated.
>
> >> What is the relationship between sysfs and the net namespace?
> >> Or is this check a little redundant?
>
> You want a bind mount not a new fresh mount.
>
Yes, we need to modify libvirt's code to handle sysfs
when userns is enabled but netns is disabled.
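
A rough sketch of what that handling could look like, assuming the
container setup code knows whether it created a private network
namespace. This is not libvirt's actual code: choose_sysfs_strategy(),
setup_sysfs() and the has_private_netns flag are hypothetical names used
only for illustration. Only mount(2) and its MS_* flags are real.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mount.h>

/* Hypothetical names for illustration; not taken from libvirt. */
enum sysfs_strategy {
    SYSFS_FRESH_MOUNT, /* new sysfs instance, gives a new /sys/class/net */
    SYSFS_BIND_MOUNT   /* reuse the existing sysfs via a bind mount */
};

/*
 * A fresh sysfs mount is only permitted (and only useful) when the
 * caller has privileges over its network namespace, i.e. when it
 * unshared a new one. Otherwise fall back to a bind mount.
 */
enum sysfs_strategy choose_sysfs_strategy(int has_private_netns)
{
    return has_private_netns ? SYSFS_FRESH_MOUNT : SYSFS_BIND_MOUNT;
}

/*
 * Mount sysfs at `target`. `host_sys` is the existing sysfs location
 * (normally "/sys") used as the bind-mount source.
 * Returns 0 on success, -1 on failure with errno set by mount(2).
 */
int setup_sysfs(const char *host_sys, const char *target,
                int has_private_netns)
{
    assert(host_sys != NULL && target != NULL);

    if (choose_sysfs_strategy(has_private_netns) == SYSFS_FRESH_MOUNT)
        return mount("sysfs", target, "sysfs",
                     MS_NOSUID | MS_NODEV | MS_NOEXEC, NULL);

    /* MS_REC carries along anything already mounted under host_sys. */
    return mount(host_sys, target, NULL, MS_BIND | MS_REC, NULL);
}
```

With userns enabled and netns disabled, a call such as
setup_sysfs("/sys", "/newroot/sys", 0) would take the bind-mount path
and never reach the kobj_ns_current_may_mount() check. The target path
here is only an example.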
Thanks,
- Chen
> When looking at how evil actors could abuse things it turned out that in
> some circumstances the root user (before a user namespace is created)
> needs to control the policy on which filesystems may be mounted. There
> are files in sysfs and in proc that you never want to see in a chroot
> jail, as they just create more surface area to attack.
>
> The only reason for creating a new fresh mount of sysfs is to get access
> to /sys/class/net. So to keep things simple we restrict creation of
> that mount to cases where the mounter has permissions over the network
> namespace, and cases where nothing interesting is mounted on top of
> sysfs.
>
> If a new /sys/class/net is not needed it is possible to bind mount the
> existing copy of sysfs to the new location without loss of
> functionality.
>
> > It is not redundant. The whole point is that after clone(CLONE_NEWUSER)
> > you get a newly filled set of capabilities. But you should not have
> > privileges over the host's network namespace. After you unshare a new
> > network namespace, you *should* have privilege over it. So the fact
> > that we've already checked CAP_SYS_ADMIN means nothing, because the
> > capabilities need to be targeted.
>
> Exactly, the tests are failing because the caller is not the global root
> and so the code is properly failing the permission checks.
>
> Eric