[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6306b4e5-f26d-1704-6344-354eb5387abf@linux.ibm.com>
Date: Fri, 3 Dec 2021 13:06:13 -0500
From: Stefan Berger <stefanb@...ux.ibm.com>
To: jejb@...ux.ibm.com, linux-integrity@...r.kernel.org
Cc: zohar@...ux.ibm.com, serge@...lyn.com,
christian.brauner@...ntu.com, containers@...ts.linux.dev,
dmitry.kasatkin@...il.com, ebiederm@...ssion.com,
krzysztof.struczynski@...wei.com, roberto.sassu@...wei.com,
mpeters@...hat.com, lhinds@...hat.com, lsturman@...hat.com,
puiterwi@...hat.com, jamjoom@...ibm.com,
linux-kernel@...r.kernel.org, paul@...l-moore.com, rgb@...hat.com,
linux-security-module@...r.kernel.org, jmorris@...ei.org
Subject: Re: [RFC v2 19/19] ima: Setup securityfs for IMA namespace
On 12/3/21 12:03, James Bottomley wrote:
> On Thu, 2021-12-02 at 21:31 -0500, Stefan Berger wrote:
> [...]
>> static int securityfs_init_fs_context(struct fs_context *fc)
>> {
>> + int rc;
>> +
>> + if (fc->user_ns->ima_ns->late_fs_init) {
>> + rc = fc->user_ns->ima_ns->late_fs_init(fc->user_ns);
>> + if (rc)
>> + return rc;
>> + }
>> fc->ops = &securityfs_context_ops;
>> return 0;
>> }
> I know I suggested this, but to get this to work in general, it's going
> to have to not be specific to IMA, so it's going to have to become
> something generic like a notifier chain. The other problem is it's
> only working still by accident:
I had thought about this also but the rationale was:
securityfs is compiled due to CONFIG_IMA_NS and the user namespace
exists there and that has a pointer now to ima_namespace, which can have
that callback. I assumed that other namespaced subsystems could also be
reached then via such a callback, but I don't know.
I suppose any late filesystem init callchain would have to be connected
to the user_namespace somehow?
>
>> +int ima_fs_ns_init(struct ima_namespace *ns)
>> +{
>> + ns->mount = securityfs_ns_create_mount(ns->user_ns);
> This actually triggers on the call to securityfs_init_fs_context, but
> nothing happens because the callback is null. Every subsequent use of
> fscontext will trigger this. The point of a keyed supeblock is that
> fill_super is only called once per key, that's the place we should be
> doing this. It should also probably be a blocking notifier so any
> consumer of securityfs can be namespaced by registering for this
> notifier.
What I don't like about the fill_super is that it gets called too early:
[ 67.058611] securityfs_ns_create_mount @ 102 target user_ns:
ffff95c010698c80; nr_extents: 0
[ 67.059836] securityfs_fill_super @ 47 user_ns: ffff95c010698c80;
nr_extents: 0
We are switching to the target user namespace in
securityfs_ns_create_mount. The expected nr_extents at this point is 0,
since user_ns hasn't been configured, yet. But then security_fill_super
is also called with nr_extents 0. We cannot use that, it's too early!
>
>> + if (IS_ERR(ns->mount)) {
>> + ns->mount = NULL;
>> + return -1;
>> + }
>> + ns->mount_count = 1;
> This is a bit nasty, too: we're spilling the guts of mount count
> tracking into IMA instead of encapsulating it inside securityfs.
Ok, I can make this disappear.
>
>> +
>> + /* Adjust the trigger for user namespace's early teardown of
>> dependent
>> + * namespaces. Due to the filesystem there's an additional
>> reference
>> + * to the user namespace.
>> + */
>> + ns->user_ns->refcount_teardown += 1;
>> +
>> + ns->late_fs_init = ima_fs_ns_late_init;
>> +
>> + return 0;
>> +}
> I think what should be happening is that we shouldn't so the
> simple_pin_fs, which creates the inodes, ahead of time; we should do it
> inside fill_super using a notifier, meaning it gets called once per
fill_super would only work for the init_user_ns from what I can see.
> key, creates the root dentry then triggers the notifier which
> instantiates all the namespaced entries. We can still use
> simple_pin_fs for this because there's no locking across fill_super.
> This would mean fill_super would be called the first time the
> securityfs is mounted inside the namespace.
I guess I would need to know how fill_super would work or how it could
be called late/delayed as well.
>
> If we do it this way, we can now make securityfs have its own mount and
> mount_count inside the user namespace, which it uses internally to the
> securityfs code, thus avoiding exposing them to ima or any other
> namespaced consumer.
>
> I also think we now don't need the securityfs_ns_ duplicated functions
> because the callback via the notifier chain now ensures we can use the
> namespace they were created in to distinguish between non namespaced
> and namespaced entries.
Is there then no need to pass a separate vfsmount * in anymore? Where
would the vfsmount pointer reside? For now it's in ima_namespace, but it
sounds like it should be in a more centralized place? Should it also be
connected to the user_namespace so we can pick it up using get_user_ns()?
>
> So non-namespaced consumers of securityfs would do what they do now
> (calling the securityfs_create on initialization) and namespaced
> consumers would register a callback on the notifier which would get
> called once for every namespace the securityfs gets mounted in.
>
> I also theorize if we do it with notifiers, we could have a notifier on
> kill_sb to tear down all the entires. If we do this, I think we don't
> have to pin any more.
>
> James
>
>
diff --git a/security/inode.c b/security/inode.c
index ed5f1c533776..49c9839642ed 100644
--- a/security/inode.c
+++ b/security/inode.c
@@ -44,6 +44,8 @@ static int securityfs_fill_super(struct super_block
*sb, struct fs_context *fc)
static const struct tree_descr files[] = {{""}};
int error;
+ printk(KERN_INFO "%s @ %u user_ns: %px; nr_extents: %d\n",
__func__, __LINE__, fc->user_ns, fc->user_ns->uid_map.nr_extents);
+
error = simple_fill_super(sb, SECURITYFS_MAGIC, files);
if (error)
return error;
@@ -97,6 +99,8 @@ struct vfsmount *securityfs_ns_create_mount(struct
user_namespace *user_ns)
put_user_ns(fc->user_ns);
fc->user_ns = get_user_ns(user_ns);
+ printk(KERN_INFO "%s @ %u target user_ns: %px; nr_extents:
%d\n", __func__, __LINE__, fc->user_ns, fc->user_ns->uid_map.nr_extents);
+
mnt = fc_mount(fc);
put_fs_context(fc);
return mnt;
Powered by blists - more mailing lists