Message-Id: <20260128-twmount-v1-1-b1d446362da9@kernel.org>
Date: Wed, 28 Jan 2026 12:47:14 -0500
From: Jeff Layton <jlayton@...nel.org>
To: Alexander Viro <viro@...iv.linux.org.uk>,
Christian Brauner <brauner@...nel.org>, Jan Kara <jack@...e.cz>,
"Seth Forshee (DigitalOcean)" <sforshee@...nel.org>,
Alexander Mikhalitsyn <alexander@...alicyn.com>
Cc: linux-fsdevel@...r.kernel.org, linux-kernel@...r.kernel.org,
Jeff Layton <jlayton@...nel.org>
Subject: [PATCH RFC] vfs: allow mounting inside a container without
FS_USERNS_MOUNT by root
Meta (and some other places) have an unusual process for doing an NFS
mount inside an unprivileged container. They do the fsopen() inside the
container, and then pass the resulting fd over a unix socket to a
privileged daemon running outside that container, which then does the mount.
Commit e1c5ae59c0f22 ("fs: don't allow non-init s_user_ns for
filesystems without FS_USERNS_MOUNT") broke this scheme: fc->user_ns is
not init_user_ns, so sget_fc() now fails with -EPERM even though the
daemon doing the mount has CAP_SYS_ADMIN.
Add a capable(CAP_SYS_ADMIN) check to get it working again.
Fixes: e1c5ae59c0f22 ("fs: don't allow non-init s_user_ns for filesystems without FS_USERNS_MOUNT")
Signed-off-by: Jeff Layton <jlayton@...nel.org>
---
We've needed to revert e1c5ae59c0f22 for the last year or so in order to
keep NFS mounts inside containers working. Does this approach seem sane,
or are there valid concerns with allowing this that I'm not aware of?
This is not well tested yet, hence the RFC.
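For anyone unfamiliar with the handoff described in the commit message, here
is a minimal sketch of the fd-passing step using SCM_RIGHTS over an AF_UNIX
socket. This is an illustration only, not the code the daemon actually uses:
send_fd() and recv_fd() are hypothetical helper names, and the fsopen() call
in the container and the fsconfig()/fsmount() calls in the daemon are left out.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Container side (hypothetical helper): send an open fd, e.g. the one
 * returned by fsopen(), over a connected AF_UNIX socket.
 * Returns 0 on success, -1 on error.
 */
int send_fd(int sock, int fd)
{
	char byte = 0;	/* stream sockets need at least one byte of real data */
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;	/* forces suitable alignment of buf */
	} u;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = u.buf,
		.msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;

	memset(&u, 0, sizeof(u));
	cmsg = CMSG_FIRSTHDR(&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

	return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/*
 * Daemon side (hypothetical helper): receive one fd sent by send_fd().
 * Returns the new fd in this process, or -1 on error.
 */
int recv_fd(int sock)
{
	char byte;
	struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
	union {
		char buf[CMSG_SPACE(sizeof(int))];
		struct cmsghdr align;
	} u;
	struct msghdr msg = {
		.msg_iov = &iov,
		.msg_iovlen = 1,
		.msg_control = u.buf,
		.msg_controllen = sizeof(u.buf),
	};
	struct cmsghdr *cmsg;
	int fd;

	if (recvmsg(sock, &msg, 0) != 1)
		return -1;

	/* Accept exactly one SCM_RIGHTS message carrying a single fd. */
	cmsg = CMSG_FIRSTHDR(&msg);
	if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
	    cmsg->cmsg_type != SCM_RIGHTS ||
	    cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
		return -1;

	memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
	return fd;
}
```

In the container, send_fd() would be called with the fd returned by fsopen();
the daemon would call recv_fd() and then drive fsconfig() on the result, which
is the point where sget_fc() sees an fc->user_ns that is not init_user_ns.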
---
fs/super.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/super.c b/fs/super.c
index 3d85265d14001d51524dbaec0778af8f12c048ac..d06f3e5765921a2ab341827a95dcd663c38cb594 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -738,12 +738,15 @@ struct super_block *sget_fc(struct fs_context *fc,
 	int err;
 
 	/*
-	 * Never allow s_user_ns != &init_user_ns when FS_USERNS_MOUNT is
+	 * Don't allow s_user_ns != &init_user_ns when FS_USERNS_MOUNT is
 	 * not set, as the filesystem is likely unprepared to handle it.
 	 * This can happen when fsconfig() is called from init_user_ns with
-	 * an fs_fd opened in another user namespace.
+	 * an fs_fd opened in another user namespace. If the user has
+	 * CAP_SYS_ADMIN in the init_user_ns however, allow it.
 	 */
-	if (user_ns != &init_user_ns && !(fc->fs_type->fs_flags & FS_USERNS_MOUNT)) {
+	if (user_ns != &init_user_ns &&
+	    !(fc->fs_type->fs_flags & FS_USERNS_MOUNT) &&
+	    !capable(CAP_SYS_ADMIN)) {
 		errorfc(fc, "VFS: Mounting from non-initial user namespace is not allowed");
 		return ERR_PTR(-EPERM);
 	}
---
base-commit: 1f97d9dcf53649c41c33227b345a36902cbb08ad
change-id: 20260128-twmount-c29299e88464
Best regards,
--
Jeff Layton <jlayton@...nel.org>