[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Date: Thu, 7 Jul 2022 10:34:09 -0400
From: Stefan Berger <stefanb@...ux.ibm.com>
To: "Serge E. Hallyn" <serge@...lyn.com>
Cc: linux-integrity@...r.kernel.org, zohar@...ux.ibm.com,
christian.brauner@...ntu.com, containers@...ts.linux.dev,
dmitry.kasatkin@...il.com, ebiederm@...ssion.com,
krzysztof.struczynski@...wei.com, roberto.sassu@...wei.com,
mpeters@...hat.com, lhinds@...hat.com, lsturman@...hat.com,
puiterwi@...hat.com, jejb@...ux.ibm.com, jamjoom@...ibm.com,
linux-kernel@...r.kernel.org, paul@...l-moore.com, rgb@...hat.com,
linux-security-module@...r.kernel.org, jmorris@...ei.org,
jpenumak@...hat.com, Christian Brauner <brauner@...nel.org>,
James Bottomley <James.Bottomley@...senPartnership.com>
Subject: Re: [PATCH v12 02/26] securityfs: Extend securityfs with namespacing
support
On 5/20/22 22:23, Serge E. Hallyn wrote:
> On Wed, Apr 20, 2022 at 10:06:09AM -0400, Stefan Berger wrote:
>> Enable multiple instances of securityfs by keying each instance with a
>> pointer to the user namespace it belongs to.
>>
>> Since we do not need the pinning of the filesystem for the virtualization
>> case, limit the usage of simple_pin_fs() and simpe_release_fs() to the
>> case when the init_user_ns is active. This simplifies the cleanup for the
>> virtualization case where usage of securityfs_remove() to free dentries
>> is therefore not needed anymore.
>>
>> For the initial securityfs, i.e. the one mounted in the host userns mount,
>> nothing changes. The rules for securityfs_remove() are as before and it is
>> still paired with securityfs_create(). Specifically, a file created via
>> securityfs_create_dentry() in the initial securityfs mount still needs to
>> be removed by a call to securityfs_remove(). Creating a new dentry in the
>> initial securityfs mount still pins the filesystem like it always did.
>> Consequently, the initial securityfs mount is not destroyed on
>> umount/shutdown as long as at least one user of it still has dentries that
>> it hasn't removed with a call to securityfs_remove().
>>
>> Prevent mounting of an instance of securityfs in another user namespace
>> than it belongs to. Also, prevent accesses to files and directories by
>> a user namespace that is neither the user namespace it belongs to
>> nor an ancestor of the user namespace that the instance of securityfs
>> belongs to. Do not prevent access if securityfs was bind-mounted and
>> therefore the init_user_ns is the owning user namespace.
>>
>> Suggested-by: Christian Brauner <brauner@...nel.org>
>> Signed-off-by: Stefan Berger <stefanb@...ux.ibm.com>
>> Signed-off-by: James Bottomley <James.Bottomley@...senPartnership.com>
>>
>> ---
>> v11:
>> - Formatted comment's first line to be '/*'
>> ---
>> security/inode.c | 73 ++++++++++++++++++++++++++++++++++++++++--------
>> 1 file changed, 62 insertions(+), 11 deletions(-)
>>
>> diff --git a/security/inode.c b/security/inode.c
>> index 13e6780c4444..84c9396792a9 100644
>> --- a/security/inode.c
>> +++ b/security/inode.c
>> @@ -21,9 +21,38 @@
>> #include <linux/security.h>
>> #include <linux/lsm_hooks.h>
>> #include <linux/magic.h>
>> +#include <linux/user_namespace.h>
>>
>> -static struct vfsmount *mount;
>> -static int mount_count;
>> +static struct vfsmount *init_securityfs_mount;
>> +static int init_securityfs_mount_count;
>> +
>> +static int securityfs_permission(struct user_namespace *mnt_userns,
>> + struct inode *inode, int mask)
>> +{
>> + int err;
>> +
>> + err = generic_permission(&init_user_ns, inode, mask);
>> + if (!err) {
>> + /*
>> + * Unless bind-mounted, deny access if current_user_ns() is not
>> + * ancestor.
>
> This comment has confused me the last few times I looked at this. I see
> now you're using "bind-mounted" as a shortcut for saying "bind mounted from
> the init_user_ns into a child_user_ns container". I do think that needs
> to be made clearer in this comment.
I rephrased the comment now.
Stefan
Powered by blists - more mailing lists