[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241112-banknoten-ehebett-211d59cb101e@brauner>
Date: Tue, 12 Nov 2024 14:10:06 +0100
From: Christian Brauner <brauner@...nel.org>
To: Erin Shepherd <erin.shepherd@....eu>, Jeff Layton <jlayton@...nel.org>,
Amir Goldstein <amir73il@...il.com>
Cc: linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
christian@...uner.io, paul@...l-moore.com, bluca@...ian.org,
Chuck Lever <chuck.lever@...cle.com>
Subject: Re: [PATCH 0/4] pidfs: implement file handle support
On Fri, Nov 01, 2024 at 01:54:48PM +0000, Erin Shepherd wrote:
> Since the introduction of pidfs, we have had 64-bit process identifiers
> that will not be reused for the entire uptime of the system. This greatly
> facilitates process tracking in userspace.
>
> There are two limitations at present:
>
> * These identifiers are currently only exposed to processes on 64-bit
> systems. On 32-bit systems, inode space is also limited to 32 bits and
> therefore is subject to the same reuse issues.
> * There is no way to go from one of these unique identifiers to a pid or
> pidfd.
>
> Patch 1 & 2 in this stack implements fh_export for pidfs. This means
> userspace can retrieve a unique process identifier even on 32-bit systems
> via name_to_handle_at.
>
> Patch 3 & 4 in this stack implement fh_to_dentry for pidfs. This means
> userspace can convert back from a file handle to the corresponding pidfd.
> To support us going from a file handle to a pidfd, we have to store a pid
> inside the file handle. To ensure file handles are invariant and can move
> between pid namespaces, we stash a pid from the initial namespace inside
> the file handle.
>
> I'm not quite sure if stashing an initial-namespace pid inside the file
> handle is the right approach here; if not, I think that patch 1 & 2 are
> useful on their own.
Sorry for the delayed reply (I'm recovering from a lengthy illness.).
I like the idea in general. I think this is really useful. A few of my
thoughts but I need input from Amir and Jeff:
* In the last patch of the series you already implement decoding of
pidfd file handles by adding a .fh_to_dentry export_operations method.
There are a few things to consider because of how open_by_handle_at()
works.
- open_by_handle_at() needs to be restricted so it only creates pidfds
from pidfs file handles that resolve to a struct pid that is
reachable in the caller's pid namespace. In other words, it should
mirror pidfd_open().
Put another way, open_by_handle_at() must not be usable to open
arbitrary pids to prevent a container from constructing a pidfd file
handle for a process that lives outside it's pid namespace
hierarchy.
With this restriction in place open_by_handle_at() can be available
to let unprivileged processes open pidfd file handles.
Related to that, I don't think we need to make open_by_handle_at()
open arbitrary pidfd file handles via CAP_DAC_READ_SEARCH. Simply
because any process in the initial pid namespace can open any other
process via pidfd_open() anyway because pid namespaces are
hierarchical.
IOW, CAP_DAC_READ_SEARCH must not override the restriction that the
provided pidfs file handle must be reachable from the caller's pid
namespace.
- open_by_handle_at() uses may_decode_fh() to determine whether it's
possible to decode a file handle as an unprivileged user. The
current checks don't make sense for pidfs. Conceptually, I think
there don't need to place any restrictions based on global
CAP_DAC_READ_SEARCH, owning user namespace of the superblock or
mount on pidfs file handles.
The only restriction that matters is that the requested pidfs file
handle is reachable from the caller's pid namespace.
- A pidfd always has exactly a single inode and a single dentry.
There's no aliases.
- Generally, in my naive opinion, I think that decoding pidfs file
handles should be a lot simpler than decoding regular path based
file handles. Because there should be no need to verify any
ancestors, or reconnect paths. Pidfs also doesn't have directory
inodes, only regular inodes. In other words, any dentry is
acceptable.
Essentially, the only thing we need is for exportfs_decode_fh_raw()
to verify that the provided pidfs file handle is resolvable in the
caller's pid namespace. If so we're done. The challenge is how to
nicely plumb this into the code without it sticking out like a sore
thumb.
- Pidfs should not be exportable via NFS. It doesn't make sense.
Powered by blists - more mailing lists