Message-ID: <c1ba0f8d-6b3c-4c2b-863c-2ce374df723c@infradead.org>
Date: Tue, 5 Aug 2025 17:19:45 -0700
From: Randy Dunlap <rdunlap@...radead.org>
To: Aleksa Sarai <cyphar@...har.com>, Alexander Viro
<viro@...iv.linux.org.uk>, Christian Brauner <brauner@...nel.org>,
Jan Kara <jack@...e.cz>, Jonathan Corbet <corbet@....net>,
Shuah Khan <shuah@...nel.org>
Cc: Andy Lutomirski <luto@...capital.net>, linux-kernel@...r.kernel.org,
linux-fsdevel@...r.kernel.org, linux-api@...r.kernel.org,
linux-doc@...r.kernel.org, linux-kselftest@...r.kernel.org
Subject: Re: [PATCH v4 2/4] procfs: add "pidns" mount option

Hi,

On 8/4/25 10:45 PM, Aleksa Sarai wrote:
> Since the introduction of pid namespaces, their interaction with procfs
> has been entirely implicit in ways that require a lot of dancing around
> by programs that need to construct sandboxes with different PID
> namespaces.
>
> Being able to explicitly specify the pid namespace to use when
> constructing a procfs super block will allow programs to no longer need
> to fork off a process which then does unshare(2) / setns(2) and
> forks again in order to construct a procfs in a pidns.
>
> So, provide a "pidns" mount option which allows such users to just
> explicitly state which pid namespace they want that procfs instance to
> use. This interface can be used with fsconfig(2) either with a file
> descriptor or a path:
>
> fsconfig(procfd, FSCONFIG_SET_FD, "pidns", NULL, nsfd);
> fsconfig(procfd, FSCONFIG_SET_STRING, "pidns", "/proc/self/ns/pid", 0);
>
> or with classic mount(2) / mount(8):
>
> // mount -t proc -o pidns=/proc/self/ns/pid proc /tmp/proc
> mount("proc", "/tmp/proc", "proc", MS_..., "pidns=/proc/self/ns/pid");
>
> As this new API is effectively shorthand for setns(2) followed by
> mount(2), the permission model for this mirrors pidns_install() to avoid
> opening up new attack surfaces by loosening the existing permission
> model.
>
> In order to avoid having to RCU-protect all users of proc_pid_ns() (to
> avoid UAFs), attempting to reconfigure an existing procfs instance's pid
> namespace will error out with -EBUSY. Creating new procfs instances is
> quite cheap, so this should not be an impediment to most users, and lets
> us avoid a lot of churn in fs/proc/* for a feature that it seems
> unlikely userspace would use.
>
> Signed-off-by: Aleksa Sarai <cyphar@...har.com>
> ---
> Documentation/filesystems/proc.rst | 8 ++++
> fs/proc/root.c | 98 +++++++++++++++++++++++++++++++++++---
> 2 files changed, 100 insertions(+), 6 deletions(-)
>
> diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> index 5236cb52e357..5a157dadea0b 100644
> --- a/Documentation/filesystems/proc.rst
> +++ b/Documentation/filesystems/proc.rst
> @@ -2360,6 +2360,7 @@ The following mount options are supported:
> hidepid= Set /proc/<pid>/ access mode.
> gid= Set the group authorized to learn processes information.
> subset= Show only the specified subset of procfs.
> + pidns= Specify a the namespace used by this procfs.
Drop the stray "a": "Specify the namespace used by this procfs."
> ========= ========================================================
>
> hidepid=off or hidepid=0 means classic mode - everybody may access all
> @@ -2392,6 +2393,13 @@ information about processes information, just add identd to this group.
> subset=pid hides all top level files and directories in the procfs that
> are not related to tasks.
>
> +pidns= specifies a pid namespace (either as a string path to something like
> +`/proc/$pid/ns/pid`, or a file descriptor when using `FSCONFIG_SET_FD`) that
> +will be used by the procfs instance when translating pids. By default, procfs
> +will use the calling process's active pid namespace. Note that the pid
> +namespace of an existing procfs instance cannot be modified (attempting to do
> +so will give an `-EBUSY` error).
> +
> Chapter 5: Filesystem behavior
> ==============================
>
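
For anyone who wants to try the FSCONFIG_SET_FD form from the commit message,
here is a minimal sketch of the whole new-mount-API sequence. It assumes a
kernel with this patch applied (the "pidns" key is not recognized otherwise),
kernel headers new enough to provide SYS_fsopen and <linux/mount.h>, and the
privileges the patch requires. The helper name mount_proc_in_pidns() is made
up for illustration and is not part of the patch.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <linux/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: mount a new procfs instance at @target that uses the
 * pid namespace referred to by @nsfd (e.g. an open /proc/<pid>/ns/pid).
 * Raw syscalls are used so that no libc wrappers are needed.
 * Returns 0 on success, -1 on failure with errno set.
 */
static int mount_proc_in_pidns(int nsfd, const char *target)
{
	int fsfd, mntfd, saved_errno, ret = -1;

	/* Create a filesystem context for procfs. */
	fsfd = syscall(SYS_fsopen, "proc", FSOPEN_CLOEXEC);
	if (fsfd < 0)
		return -1;

	/* Select the pid namespace before the super block is created. */
	if (syscall(SYS_fsconfig, fsfd, FSCONFIG_SET_FD, "pidns", NULL, nsfd) < 0)
		goto out_fs;

	/* Instantiate the super block. */
	if (syscall(SYS_fsconfig, fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0) < 0)
		goto out_fs;

	/* Turn the context into a detached mount, then attach it at @target. */
	mntfd = syscall(SYS_fsmount, fsfd, FSMOUNT_CLOEXEC, 0);
	if (mntfd < 0)
		goto out_fs;

	ret = syscall(SYS_move_mount, mntfd, "", AT_FDCWD, target,
		      MOVE_MOUNT_F_EMPTY_PATH);

	saved_errno = errno;
	close(mntfd);
	errno = saved_errno;
out_fs:
	/* Preserve the errno of the failing step across close(). */
	saved_errno = errno;
	close(fsfd);
	errno = saved_errno;
	return ret;
}
```

A caller would open the namespace with something like
open("/proc/<pid>/ns/pid", O_RDONLY | O_CLOEXEC) and pass the resulting fd as
nsfd. Setting "pidns" before FSCONFIG_CMD_CREATE matters here, since the
commit message says reconfiguring an existing instance fails with -EBUSY.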
--
~Randy