linux-kernel - Re: prevent containers from turning host filesystem readonly

Message-ID: <20120211202803.GA19961@hallyn.com>
Date:	Sat, 11 Feb 2012 20:28:03 +0000
From:	"Serge E. Hallyn" <serge@...lyn.com>
To:	"Eric W. Biederman" <ebiederm@...ssion.com>
Cc:	Serge Hallyn <serge.hallyn@...onical.com>,
	Al Viro <viro@...IV.linux.org.uk>,
	lkml <linux-kernel@...r.kernel.org>,
	Andy Whitcroft <apw@...onical.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Dave Hansen <haveblue@...ibm.com>,
	linux-security-module@...r.kernel.org,
	Linux Containers <containers@...ts.osdl.org>,
	Stéphane Graber <stgraber@...ntu.com>,
	Daniel Lezcano <daniel.lezcano@...e.fr>
Subject: Re: prevent containers from turning host filesystem readonly

Quoting Eric W. Biederman (ebiederm@...ssion.com):
> Serge Hallyn <serge.hallyn@...onical.com> writes:
> 
> > Quoting Al Viro (viro@...IV.linux.org.uk):
> >> On Fri, Feb 10, 2012 at 09:19:39PM -0600, Serge Hallyn wrote:
> >> > When a container shuts down, it likes to do 'mount -o remount,ro /'.
> >> > That sets the superblock's readonly flag, not the mount's.  So unless
> >> > the mount action fails for some reason (i.e. a file is held open on
> >> > the fs), if the container's rootfs is just a directory on the host's
> >> > fs, the host fs will be marked readonly.
> >> > 
> >> > Thanks to Dave Hansen for pointing out how simple the fix can be.  If
> >> > the devices cgroup denies the mounting task write access to the
> >> > underlying superblock (as it usually does when the container's root fs
> >> > is on a block device shared with the host), then do_remount_sb should
> >> > deny the right to change mount flags as well.
> >> > 
> >> > This patch adds that check.
> >> > 
> >> > Note that another possibility would be to have the LSM step in.  We
> >> > can't catch this (as is) at the LSM level because security_remount_sb
> >> > doesn't get the mount flags, so we can't distinguish
> >> > 	mount -o remount,ro
> >> > from
> >> > 	mount --bind -o remount,ro.
> >> > Sending the flags to that hook would probably be a good idea in addition
> >> > to this patch, but I haven't done it here.
> >> 
> >> NAK.  This is just plain wrong - what about the filesystems that are not
> >
> > BTW, sorry - the patch clearly should've taken non-bdevs into account, but
> > I accept that wouldn't have been enough to evade a NAK.
> >
> >> bdev-backed or, as e.g. btrfs, sit on more than one device?
> >
> > btrfs is actually one of my main motivators - to quickly snapshot containers
> > with btrfs means that the containers all share one fs, but that means one
> > container can mark them all ro.
> 
> Serge, let me respectfully suggest that getting the user namespace done
> will deal with this issue nicely.
> 
> In the simple case you simply won't be root so remount will just be
> denied.
> 
> When/if we allow a limited form of unprivileged mounts in a user
> namespace your user won't have mounted the filesystem so you should not
> have the privilege to call remount on the filesystem.

Hm, that's a good point.  Though note it'll require the userns code to
distinguish between a bind remount and a superblock remount.  The
last time we seriously discussed this, that wasn't even on the roadmap.
It was only going to support fully assigning the whole filesystem to
a user namespace.  In that case, the remount issue doesn't apply anyway
as the fs isn't shared with another container.
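
For reference, at the mount(2) level the two cases differ only in
whether MS_BIND accompanies MS_REMOUNT.  Below is a minimal userspace
sketch of that distinction, assuming glibc's <sys/mount.h>.  The names
classify_remount(), readonly_remount_flags() and remount_readonly() are
hypothetical helpers made up for illustration; this is not the kernel
code and not the patch under discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/mount.h>

/* Hypothetical classification, for illustration only. */
enum remount_kind {
    NOT_A_REMOUNT,      /* MS_REMOUNT is not set */
    SUPERBLOCK_REMOUNT, /* like "mount -o remount,ro": changes the superblock */
    BIND_REMOUNT        /* like "mount --bind -o remount,ro": changes one mount */
};

/* Decide which kind of remount a set of mount(2) flags requests. */
enum remount_kind classify_remount(unsigned long flags)
{
    if (!(flags & MS_REMOUNT))
        return NOT_A_REMOUNT;
    return (flags & MS_BIND) ? BIND_REMOUNT : SUPERBLOCK_REMOUNT;
}

/* Build the flags for a read-only remount, per-mount or per-superblock. */
unsigned long readonly_remount_flags(bool bind_only)
{
    unsigned long flags = MS_REMOUNT | MS_RDONLY;

    if (bind_only)
        flags |= MS_BIND;
    return flags;
}

/*
 * Issue the remount.  Source and filesystem type are ignored for
 * MS_REMOUNT, so NULL is passed for both.  Returns 0 on success and
 * -1 with errno set on failure (e.g. EPERM without the privilege).
 */
int remount_readonly(const char *target, bool bind_only)
{
    assert(target != NULL);
    return mount(NULL, target, NULL, readonly_remount_flags(bind_only), NULL);
}
```

A container shutdown script that only wants its own view of / to go
read-only would correspond to remount_readonly("/", true).  The
superblock variant, remount_readonly("/", false), is the one that
affects every other user of the shared filesystem.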

In any case, there are other workarounds, so I wasn't in a hurry to
address this - it just should be addressed eventually.  I just figured
that to bring up the issue I needed a patch :)

> I think I will have a set of patches ready for serious scrutiny in
> the next week or so.  So we aren't talking impossible pie in the sky
> distance to see this happen.

Awesome.

thanks,
-serge