linux-kernel - Re: [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super blocks between user namespaces

Message-ID: <20160517235834.GA104031@ubuntu-hedt>
Date:	Tue, 17 May 2016 18:58:34 -0500
From:	Seth Forshee <seth.forshee@...onical.com>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	Alexander Viro <viro@...iv.linux.org.uk>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Jeff Layton <jlayton@...chiereds.net>,
	"J. Bruce Fields" <bfields@...ldses.org>,
	Tejun Heo <tj@...nel.org>, Li Zefan <lizefan@...wei.com>,
	Johannes Weiner <hannes@...xchg.org>,
	Serge Hallyn <serge.hallyn@...onical.com>,
	Richard Weinberger <richard.weinberger@...il.com>,
	Austin S Hemmelgarn <ahferroin7@...il.com>,
	Miklos Szeredi <mszeredi@...hat.com>,
	Pavel Tikhomirov <ptikhomirov@...tuozzo.com>,
	linux-kernel@...r.kernel.org, linux-bcache@...r.kernel.org,
	dm-devel@...hat.com, linux-raid@...r.kernel.org,
	linux-mtd@...ts.infradead.org, linux-fsdevel@...r.kernel.org,
	fuse-devel@...ts.sourceforge.net,
	linux-security-module@...r.kernel.org, selinux@...ho.nsa.gov,
	cgroups@...r.kernel.org
Subject: Re: [PATCH v4 03/21] fs: Allow sysfs and cgroupfs to share super
 blocks between user namespaces

On Tue, May 17, 2016 at 05:39:33PM -0500, Eric W. Biederman wrote:
> Seth Forshee <seth.forshee@...onical.com> writes:
> 
> > Both of these filesystems already have use cases for mounting the
> > same super block from multiple user namespaces. For sysfs this
> > happens when using criu for snapshotting a container, where sysfs
> > is mounted in the container's network ns but the host's user ns.
> > The cgroup filesystem shares the same super block for all mounts
> > of the same hierarchy regardless of the namespace.
> >
> > As a result, the restriction on mounting a super block from a
> > single user namespace creates regressions for existing uses of
> > these filesystems. For these specific filesystems this
> > restriction isn't really necessary since the backing store is
> > objects in kernel memory and thus the ids assigned from inodes
> > are not subject to translation relative to s_user_ns.
> >
> > Add a new filesystem flag, FS_USERNS_SHARE_SB, which when set
> > causes sget_userns() to skip the check of s_user_ns. Set this
> > flag for the sysfs and cgroup filesystems to fix the
> > regressions.
> 
> So this one needs to be sget_userns(..., &init_user_ns, ...).
> And not a new special case.

This is actually what I wanted to do, but based on a previous discussion
where I had suggested doing this (for a different reason) I came away
thinking you did not want it that way. So I'm happy with that change.
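
For readers following the thread, the difference between the two approaches can be sketched in a small userspace model. This is only an illustration under stated assumptions: struct user_ns, struct super_block, model_sget_userns() and model_mount_shared() are simplified, hypothetical stand-ins, not the kernel's definitions, and the real sget_userns() takes different arguments. The model assumes only what the quoted description says, namely that an existing super block cannot be reused from a different user namespace.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct user_ns { const char *name; };
struct super_block { int key; struct user_ns *s_user_ns; };

static struct user_ns init_user_ns = { "init_user_ns" };

#define MAX_SB 8
static struct super_block sb_table[MAX_SB];
static size_t sb_count;

/*
 * Toy model of the lookup: reuse the super block matching 'key', but
 * refuse (NULL here stands in for an error return) if it was created
 * for a different user namespace. Otherwise create a new entry.
 */
static struct super_block *model_sget_userns(int key, struct user_ns *ns)
{
	size_t i;

	assert(ns != NULL);
	for (i = 0; i < sb_count; i++) {
		if (sb_table[i].key != key)
			continue;
		if (sb_table[i].s_user_ns != ns)
			return NULL;	/* owned by another user ns */
		return &sb_table[i];
	}
	if (sb_count == MAX_SB)
		return NULL;	/* table full */
	sb_table[sb_count].key = key;
	sb_table[sb_count].s_user_ns = ns;
	return &sb_table[sb_count++];
}

/*
 * Model of the suggestion in the reply: a filesystem that wants to share
 * its super block always passes &init_user_ns, so the namespace check
 * matches no matter which user ns the mounter is in. No new flag needed.
 */
static struct super_block *model_mount_shared(int key, struct user_ns *mounter)
{
	(void)mounter;	/* deliberately ignored for the lookup */
	return model_sget_userns(key, &init_user_ns);
}
```

In this model, a second lookup of the same key from a child namespace fails through model_sget_userns() but succeeds through model_mount_shared(), which is the behavior the flag was meant to provide.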

But if we do that it violates some of the assumptions of the patch to
rework MNT_NODEV on your testing branch (and also those behind patch 2
in this series). Something will need to be changed there to prevent a
regression in mount behavior when a user ns tries to mount without
MNT_NODEV when the mount inherited from its parent has it set.

> Apologies for not catching this earlier.

Actually this is a more recent patch, so you possibly hadn't seen it
before.

> I am looking at folding all of this into the patch that introduces
> sget_userns so that even bisects won't have regressions.

That's fine with me.

Thanks,
Seth