linux-kernel - Re: [PATCH 0/1] shiftfs: uid/gid shifting filesystem

Date:	Wed, 01 Jun 2016 12:41:00 -0400
From:	James Bottomley <James.Bottomley@...senPartnership.com>
To:	Michał Zegan <webczat_200@...zta.onet.pl>,
	Djalal Harouni <tixxdz@...il.com>, Chris Mason <clm@...com>,
	tytso@....edu, Serge Hallyn <serge.hallyn@...onical.com>,
	Josh Triplett <josh@...htriplett.org>,
	"Eric W. Biederman" <ebiederm@...ssion.com>,
	Andy Lutomirski <luto@...nel.org>,
	Seth Forshee <seth.forshee@...onical.com>,
	linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
	linux-security-module@...r.kernel.org,
	Dongsu Park <dongsu@...ocode.com>,
	David Herrmann <dh.herrmann@...glemail.com>,
	Miklos Szeredi <mszeredi@...hat.com>,
	Alban Crequy <alban.crequy@...il.com>,
	Al Viro <viro@...IV.linux.org.uk>
Subject: Re: [PATCH 0/1] shiftfs: uid/gid shifting filesystem

On Wed, 2016-06-01 at 18:21 +0200, Michał Zegan wrote:
> As I sent my reply the ... wrong way, I am sending it again. My question was:
> Why isn't it done at the vfs layer when you mount the fs in different
> userns, instead of using a separate filesystem for it?

Well, that is what this patch does:

http://thread.gmane.org/gmane.linux.kernel/2214882

However, the reason it doesn't work for me is that I want to be able to
unpack the image into a subdirectory (so I'm not dedicating a whole
filesystem to this).  This is primarily for a Docker hack IBM is
working on to allow each container instance to use a separate uid/gid
range, so I need something that behaves much more like a bind mount.

> I believe it could be useful to be able to mount all filesystems in a
> userns with autoshifted uids, although I do not know the security
> implications of that usage.

As long as you don't need to subdivide the volume, it works nicely.
However, from a security point of view, that entire volume is now
effectively freely writeable by anyone who can set up a userns.  If you
follow the shiftfs route, you can break off writeable subdirectories
for each namespace shift, but they can't cross over into writing
subdirectories that belong to other user namespaces (assuming the uids
are fully segregated).
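
As an illustration of what "fully segregated" means here, the following
is a minimal sketch and not part of the patch. It assumes each namespace
is described by (first id inside, first id on the host, count) entries,
and the three container maps are hypothetical values made up for the
example.

```python
from typing import List, Tuple

# (first id inside the namespace, first id on the host, count)
IdMap = List[Tuple[int, int, int]]


def host_ranges(id_map: IdMap) -> List[Tuple[int, int]]:
    """Return the half-open [start, end) host id ranges an id map covers."""
    return [(outer, outer + count) for _inner, outer, count in id_map]


def fully_segregated(map_a: IdMap, map_b: IdMap) -> bool:
    """True if no host id is reachable from both maps."""
    for a_start, a_end in host_ranges(map_a):
        for b_start, b_end in host_ranges(map_b):
            # Two half-open ranges overlap when each starts before the other ends.
            if a_start < b_end and b_start < a_end:
                return False
    return True


# Hypothetical per-container shifts, chosen only for illustration.
container_a: IdMap = [(0, 100000, 1000)]
container_b: IdMap = [(0, 200000, 1000)]
container_c: IdMap = [(0, 100500, 1000)]  # overlaps container_a on the host
```

With these values, container_a and container_b share no host ids, while
container_a and container_c do, so files shifted for one would be
writeable from the other.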

James


> On 01.06.2016 at 02:29, James Bottomley wrote:
> > [This patch is updated for the new VFS APIs in 4.7-rc1; it's also been
> > updated as Serge has been hammering on it]
> > 
> > My use case for this is that I run a lot of unprivileged architectural
> > emulation containers on my system using user namespaces.  Details here:
> > 
> > http://blog.hansenpartnership.com/unprivileged-build-containers/
> > 
> > They're mostly for building non-x86 stuff (like aarch64 and arm secure
> > boot and mips images).  For builds, I have all the environments in my
> > home directory with downshifted uids; however, sometimes I need to use
> > them to administer real images that run on systems, meaning the uids
> > are the usual privileged ones not the downshifted ones.  The only
> > current choice I have is to start the emulation as root so the uid/gids
> > match.  The reason for this filesystem is to use my standard
> > unprivileged containers to maintain these images.  The way I do this is
> > crack the image with a loop and then shift the uids before bringing up
> > the container.  I usually loop mount into /var/tmp/images/, so it's
> > owned by real root there:
> > 
> > jarvis:~ # ls -l /var/tmp/images/mips|head -4
> > total 0
> > drwxr-xr-x 1 root root 8192 May 12 08:33 bin
> > drwxr-xr-x 1 root root    6 May 12 08:33 boot
> > drwxr-xr-x 1 root root  167 May 12 08:33 dev
> > 
> > And I usually run my build containers with a uid_map of 
> > 
> >          0     100000       1000
> >       1000       1000          1
> >      65534     101000          1
> > 
> > (maps 0-999 shifted, then shifts nobody to 101000 and keeps my uid
> > [1000] fixed so I can mount my home directory into the namespace) and
> > something similar with gid_map. So I shift mount the mips image with
> > 
> > mount -t shiftfs \
> >   -o uidmap=0:100000:1000,uidmap=65534:101000:1,gidmap=0:100000:100,gidmap=101:100101:899,gidmap=65533:101000:2 \
> >   /var/tmp/images/mips /home/jejb/containers/mips
> > 
> > and I now see it as
> > 
> > jejb@...vis:~> ls -l containers/mips|head -4
> > total 0
> > drwxr-xr-x 1 100000 100000 8192 May 12 08:33 bin/
> > drwxr-xr-x 1 100000 100000    6 May 12 08:33 boot/
> > drwxr-xr-x 1 100000 100000  167 May 12 08:33 dev/
> > 
> > like my usual unprivileged build roots, and I can now use an
> > unprivileged container to enter and administer the image.
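
To make the shift above concrete, here is a rough sketch, not taken from
the patch, of how the uidmap=/gidmap= options could be read. It assumes
each entry is first-id-in-image:first-id-seen-through-the-mount:count,
which matches the ls output above (root, id 0, showing up as 100000).
The helper names parse_shift_options and shift are hypothetical.

```python
from typing import List, Optional, Tuple

# (first id in the image, first id seen through the shift mount, count)
IdMap = List[Tuple[int, int, int]]


def parse_shift_options(options: str, key: str) -> IdMap:
    """Collect every 'key=a:b:c' entry from a comma-separated option string."""
    entries: IdMap = []
    for option in options.split(","):
        name, _, value = option.partition("=")
        if name == key:
            inner, outer, count = (int(field) for field in value.split(":"))
            entries.append((inner, outer, count))
    return entries


def shift(image_id: int, id_map: IdMap) -> Optional[int]:
    """Translate an id stored in the image; None means no entry covers it."""
    for inner, outer, count in id_map:
        if inner <= image_id < inner + count:
            return outer + (image_id - inner)
    return None


# The option string from the mount command quoted above.
OPTIONS = (
    "uidmap=0:100000:1000,uidmap=65534:101000:1,"
    "gidmap=0:100000:100,gidmap=101:100101:899,gidmap=65533:101000:2"
)
uid_map = parse_shift_options(OPTIONS, "uidmap")
gid_map = parse_shift_options(OPTIONS, "gidmap")
```

Under that reading, shift(0, uid_map) and shift(0, gid_map) both give
100000, the owner and group shown for bin, boot and dev, and gid 100
falls in the gap between the first two gidmap entries, so it has no
mapping.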
> > 
> > It seems like a lot of container systems need to do something similar
> > when they try and provide unprivileged access to standard images.
> > Right at the moment, the security mechanism only allows root in the
> > host to use this, but it's not impossible to come up with a scheme for
> > marking trees that can safely be shift mounted by unprivileged user
> > namespaces.
> > 
> > James
> > 
> > ---
> > 
> >  fs/Kconfig                 |   8 +
> >  fs/Makefile                |   1 +
> >  fs/shiftfs.c               | 877 +++++++++++++++++++++++++++++++++++++++++++++
> >  include/uapi/linux/magic.h |   2 +
> >  4 files changed, 888 insertions(+)
> > 
> 
