Message-ID: <e3ee0c34-9385-09d7-94fd-961a7a2f4f6a@yandex-team.ru>
Date: Wed, 8 Feb 2017 14:45:08 +0300
From: Konstantin Khlebnikov <khlebnikov@...dex-team.ru>
To: Amir Goldstein <amir73il@...il.com>,
James Bottomley <James.Bottomley@...senpartnership.com>
Cc: Christoph Hellwig <hch@...radead.org>,
Djalal Harouni <tixxdz@...il.com>, Chris Mason <clm@...com>,
Theodore Tso <tytso@....edu>,
Josh Triplett <josh@...htriplett.org>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Andy Lutomirski <luto@...nel.org>,
Seth Forshee <seth.forshee@...onical.com>,
linux-fsdevel <linux-fsdevel@...r.kernel.org>,
linux-kernel <linux-kernel@...r.kernel.org>,
LSM List <linux-security-module@...r.kernel.org>,
Dongsu Park <dongsu@...ocode.com>,
David Herrmann <dh.herrmann@...glemail.com>,
Miklos Szeredi <mszeredi@...hat.com>,
Alban Crequy <alban.crequy@...il.com>,
Al Viro <viro@...iv.linux.org.uk>,
"Serge E. Hallyn" <serge@...lyn.com>, Phil Estes <estesp@...il.com>
Subject: Re: [RFC 1/1] shiftfs: uid/gid shifting bind mount
On 08.02.2017 09:44, Amir Goldstein wrote:
> On Wed, Feb 8, 2017 at 1:42 AM, James Bottomley
> <James.Bottomley@...senpartnership.com> wrote:
>> On Tue, 2017-02-07 at 14:25 -0800, Christoph Hellwig wrote:
>>> On Tue, Feb 07, 2017 at 11:01:29PM +0200, Amir Goldstein wrote:
>>>> Project id's are not exactly "subtree" semantic, but inheritance
>>>> semantics,
>>>> which is not the same when non empty directories get their project
>>>> id changed.
>>>> Here is a recap:
>>>> https://lwn.net/Articles/623835/
>>>
>>> Yes - but if we abuse them for containers we could refine the
>>> semantics to simply not allow change of project ids from inside
>>> containers based on say capabilities.
>>
>
> You mean something like this:
> https://lwn.net/Articles/632917/
>
> With the suggested protected_projects, projid 0 (also inside container)
> gets a special meaning, much like user 0, so we may do interesting
> things with the projid that is mapped to 0.
>
>> We can't really abuse projectid, it's part of the user namespace
>> mapping (for project quota). What we can do is have a new id that
>> behaves like it.
>>
>
> Perhaps we *can* use projid without abusing it.
> userns already maps projids, but there is no concept of "owning project"
> for a userns, nor does it make a lot of sense, because projid is not
> part of the credentials.
> But if we re-brand it as "container root projid", we can try to use it
> for defining semantics to grant unprivileged access to a subtree.
>
> The functionality you are trying to get with the shiftfs mark does
> sound a bit like "container root projid":
> - inodes with mapped projid MAY be uid/gid shifted
> - inodes with unmapped projid MAY NOT
>
> I realize this may be very raw, but it's a start. If you like this
> direction we can try to develop it.
>
>> But like I said, we don't really need a full ID, it would basically just
>> be a single bit mark to say remap or not when doing permission checks
>> against this inode. It would follow some of the project id semantics
>> (like inheritance from parent dir)
>>
>
> But a single bit would only work for a single level of userns nesting, wouldn't it?
>
>
>>>> I guess we should define the semantics for the required sub-tree
>>>> marking, before we can talk about solutions.
>>>
>>> Good plan.
>>
>> So I've been thinking about how to do this without subtree marking and
>> yet retain the subtree properties similar to project id. The advantage
>> would be that if it can be done using only inode properties, then none
>> of the permission prototypes need change. The only real subtree
>> property we need is ability to bind into an unprivileged mount
>> namespace, but we already have that. The gotcha about marking inodes
>> is that they're all or nothing, so every subtree that gets access to
>> the inode inherits the mark. This means that we cannot allow a user
>> access to a marked inode without the cover of an unprivileged user
>> namespace, but I think that's fixable in the permission check
>> (basically if the inode is marked you *only* get access if you have a
>> user_ns != init_user_ns and we do the permission shifts or you have
>> user_ns == init_user_ns and you are admin capable).
>>
>
> I didn't follow, but it sounds like your proposed solution is only
> good for a single level of userns nesting.
> Do you think you can redefine it in terms of "container root projid"?
>
It looks like all of this started from mangling uid/gid or some other metadata.
As usual, I have to propose funny/insane solutions:
proxy the filesystem through FUSE and mangle everything in userspace.
Or add some kind of userspace-driven remapping/mangling to overlay,
for example using a BPF script (I see it everywhere nowadays).
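
For illustration only, here is a minimal sketch of the kind of uid/gid
mangling a userspace proxy would have to perform on every stat result and
every chown request. It is not taken from any patch in this thread. The
names IdExtent, to_inside, to_outside and mangle_stat are hypothetical, the
extent layout is assumed to mirror the (inside, outside, count) triples of
a user namespace uid_map, and the fallback value 65534 is assumed to match
the kernel's default overflow id.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

# Assumption: unmapped IDs are reported as the default overflow uid/gid.
OVERFLOW_ID = 65534


class IdExtent(NamedTuple):
    """One mapping extent (hypothetical helper type)."""
    inside: int   # first ID as seen inside the container
    outside: int  # first ID as stored on the backing filesystem
    count: int    # number of consecutive IDs covered


def to_inside(extents: Sequence[IdExtent], on_disk_id: int) -> int:
    """Translate an on-disk ID into the ID the container should see."""
    for ext in extents:
        if ext.outside <= on_disk_id < ext.outside + ext.count:
            return ext.inside + (on_disk_id - ext.outside)
    return OVERFLOW_ID


def to_outside(extents: Sequence[IdExtent], container_id: int) -> Optional[int]:
    """Translate a container ID into the on-disk ID, or None if unmapped.

    A proxy would refuse the operation (for example a chown) on None.
    """
    for ext in extents:
        if ext.inside <= container_id < ext.inside + ext.count:
            return ext.outside + (container_id - ext.inside)
    return None


def mangle_stat(extents: Sequence[IdExtent], st_uid: int, st_gid: int) -> Tuple[int, int]:
    """Rewrite the owner fields of a stat result for the container's view."""
    return to_inside(extents, st_uid), to_inside(extents, st_gid)


# Example: container IDs 0..65535 are stored on disk as 100000..165535.
extents = [IdExtent(inside=0, outside=100000, count=65536)]
shifted = mangle_stat(extents, 100000, 101000)
```

A FUSE proxy would call something like mangle_stat in its getattr path and
to_outside in its chown path. The same table lookup is what an in-kernel
remapping hook would need to evaluate, whatever mechanism drives it.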