linux-kernel - Re: [PATCH 0/7] Initial support for user namespace owned mounts


Message-ID: <87wpy1dpjg.fsf@x220.int.ebiederm.org>
Date:	Wed, 15 Jul 2015 17:28:03 -0500
From:	ebiederm@...ssion.com (Eric W. Biederman)
To:	Seth Forshee <seth.forshee@...onical.com>
Cc:	Casey Schaufler <casey@...aufler-ca.com>,
	Alexander Viro <viro@...iv.linux.org.uk>,
	linux-fsdevel@...r.kernel.org,
	linux-security-module@...r.kernel.org, selinux@...ho.nsa.gov,
	Serge Hallyn <serge.hallyn@...onical.com>,
	Andy Lutomirski <luto@...capital.net>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 0/7] Initial support for user namespace owned mounts

Seth Forshee <seth.forshee@...onical.com> writes:

> On Wed, Jul 15, 2015 at 04:06:35PM -0500, Eric W. Biederman wrote:
>> Casey Schaufler <casey@...aufler-ca.com> writes:
>> 
>> > On 7/15/2015 12:46 PM, Seth Forshee wrote:
>> >> These are the first in a larger set of patches that I've been working on
>> >> (with help from Eric Biederman) to support mounting ext4 and fuse
>> >> filesystems from within user namespaces. I've pushed the full series to:
>> >>
>> >>   git://kernel.ubuntu.com/sforshee/linux.git userns-mounts
>> >>
>> >> Taking the series as a whole, the strategy is to handle as much of the
>> >> heavy lifting as possible in the vfs so the filesystems don't have to
>> >> handle weird edge cases. If you look at the full series you'll find that
>> >> the changes in ext4 to support user namespace mounts turn out to be
>> >> fairly minimal (fuse is a bit more complicated though as it must deal
>> >> with translating ids for a userspace process which is running in pid and
>> >> user namespaces).
>> >>
>> >> The patches I'm sending today lay some of the groundwork in the vfs and
>> >> related code. They fall into two broad groups:
>> >>
>> >>  1. Patches 1-2 add s_user_ns and simplify MNT_NODEV handling. These are
>> >>     pretty straightforward, and Eric has expressed interest in merging
>> >>     these patches soon. Note that patch 2 won't apply cleanly without
>> >>     Eric's noexec patches for proc and sys [1].
>> >>
>> >>  2. Patches 3-7 tighten down security for mounts with s_user_ns !=
>> >>     &init_user_ns. This includes updates to how file caps and suid are
>> >>     handled and LSM updates to ignore security labels on superblocks
>> >>     from non-init namespaces.
>> >>
>> >>     The LSM changes in particular may not be optimal, as I don't have a
>> >>     lot of familiarity with this code, so I'd be especially appreciative
>> >>     of review of these changes and suggestions on how to improve them.
>> >
>> > Lukasz Pawelczyk <l.pawelczyk@...sung.com> proposed
>> > LSM support in user namespaces ([RFC] lsm: namespace hooks)
>> > that makes a whole lot more sense than just turning off
>> > the option of using labels on files. Gutting the ability
>> > to use MAC in a namespace is a step down the road of
>> > making MAC and namespaces incompatible.
>> 
>> This is not "turning off the option to use labels on files".
>> 
>> This is supporting mounting filesystems like ext4 by unprivileged users
>> and not trusting the labels they set in the same way as we trust labels
>> on filesystems mounted by privileged users.
>> 
>> The first step needs to be not trusting those labels and treating such
>> filesystems as filesystems without label support.  I hope that is what
>> Seth has implemented.
>> 
>> In the long run we can do more interesting things with such filesystems
>> once the appropriate LSM policy is in place.
>
> Yes, this exactly. Right now it looks to me like the only safe thing to
> do with mounts from unprivileged users is to ignore the security labels,
> so that's what I'm trying to do with these changes. If there's some
> better thing to do, or some better way to do it, I'm more than happy to
> receive that feedback.

Ugh.

This made me realize that we have an interesting problem here.  An
unprivileged mount of tmpfs probably needs to have
s_user_ns == &init_user_ns.

Otherwise we will break security labels on tmpfs for no good reason.
ramfs and sysfs also seem to have similar concerns.

Because they have no backing store we can trust those filesystems with
security labels.  Plus, for at least sysfs, there is the security label
bleed-through issue, which we need to make certain keeps working.

Perhaps these filesystems with trusted backing store need to call
"sget_userns(..., &init_user_ns)".
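
To make the idea concrete, here is a minimal userspace sketch of that
rule.  It is only a model, not kernel code: the struct layouts, the
has_untrusted_backing_store flag, and the model_* helpers are all
hypothetical names made up for illustration, and only s_user_ns and
init_user_ns come from the discussion above.  It assumes the label
trust check is simply "s_user_ns == &init_user_ns".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these types mimic, but are not, the kernel's. */
struct user_namespace {
	const char *name;
};

struct user_namespace init_user_ns = { "init_user_ns" };

struct fs_type {
	const char *name;
	/* Hypothetical flag: true for ext4/fuse-like filesystems whose
	 * on-disk contents an unprivileged mounter controls. */
	bool has_untrusted_backing_store;
};

struct super_block {
	const struct fs_type *type;
	struct user_namespace *s_user_ns;
};

/*
 * Model of the proposal: a filesystem with no untrusted backing store
 * (tmpfs, ramfs, sysfs) pins s_user_ns to init_user_ns no matter who
 * mounts it; anything else is owned by the mounter's namespace.
 */
struct super_block model_sget_userns(const struct fs_type *fs,
				     struct user_namespace *mounter_ns)
{
	struct super_block sb;

	assert(fs != NULL && mounter_ns != NULL);
	sb.type = fs;
	sb.s_user_ns = fs->has_untrusted_backing_store ? mounter_ns
						       : &init_user_ns;
	return sb;
}

/* Assumed LSM-side check: labels are honoured only on superblocks
 * owned by the initial user namespace. */
bool model_trusts_security_labels(const struct super_block *sb)
{
	return sb->s_user_ns == &init_user_ns;
}
```

In this model a tmpfs superblock created from a child user namespace
still reports trusted labels, while an ext4 superblock created from the
same namespace does not, which is the distinction being argued for.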

If we don't get this right we will have significant regressions with
respect to security labels, and that is not ok.

Eric