Message-ID: <20241113-entnimmt-weintrauben-3b0b4a1a18b7@brauner>
Date: Wed, 13 Nov 2024 14:26:26 +0100
From: Christian Brauner <brauner@...nel.org>
To: Erin Shepherd <erin.shepherd@....eu>
Cc: linux-kernel@...r.kernel.org, linux-fsdevel@...r.kernel.org,
christian@...uner.io, paul@...l-moore.com, bluca@...ian.org
Subject: Re: [PATCH 4/4] pidfs: implement fh_to_dentry
On Wed, Nov 13, 2024 at 02:06:56PM +0100, Erin Shepherd wrote:
> On 13/11/2024 13:09, Christian Brauner wrote:
>
> > Hm, a pidfd comes in two flavours:
> >
> > (1) thread-group leader pidfd: pidfd_open(<pid>, 0)
> > (2) thread pidfd: pidfd_open(<pid>, PIDFD_THREAD)
> >
> > In your current scheme fid->pid = pid_nr(pid) means that you always
> > encode a pidfs file handle for a thread pidfd no matter if the provided
> > pidfd was a thread-group leader pidfd or a thread pidfd. This is very
> > likely wrong as it means users that use a thread-group pidfd get a
> > thread-specific pid back.
> >
> > I think we need to encode (1) and (2) in the pidfs file handle so users
> > always get back the correct type of pidfd.
> >
> > That very likely means name_to_handle_at() needs to encode this into the
> > pidfs file handle.
>
> I guess a question here is whether a pidfd handle encodes a handle to a pid
> in a specific mode, or just to a pid in general? The thought had occurred
> to me while I was working on this initially, but I felt like perhaps treating
> it as a property of the file descriptor in general was better.
>
> Currently open_by_handle_at always returns a thread-group pidfd (since
> PIDFD_THREAD isn't set), regardless of what type of pidfd you passed to
> name_to_handle_at. I had thought that PIDFD_THREAD/O_EXCL would have been
I don't think you're returning a thread-group pidfd from
open_by_handle_at() in your scheme. After all you're encoding the tid in
pid_nr() so you'll always find the struct pid for the thread afaict. If
I'm wrong could you please explain how you think this works? I might
just be missing something obvious.
> passed through to f->f_flags on the restored pidfd, but upon checking I see that
> it gets filtered out in do_dentry_open.
It does, but note that __pidfd_prepare() raises it explicitly on the
file afterwards. So it works fine.
>
> I feel like leaving it up to the caller of open_by_handle_at might be better
> (because they are probably better informed about whether they want poll() to
> inform them of thread or process exit) but I could lean either way.
So in order to decode a pidfs file handle you want the caller to have to
specify O_EXCL in the flags argument of open_by_handle_at()? Is that
your idea?
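
To make the two options concrete, here is a minimal userspace sketch. It
is only a model: struct model_fid and both helpers are hypothetical names
and do not reflect the struct pidfd_fid layout in the patch. Option A
records the flavour when the handle is encoded; option B leaves it to the
caller, who passes O_EXCL (which PIDFD_THREAD aliases) at decode time.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>

/* PIDFD_THREAD aliases O_EXCL; define it here in case the headers lack it. */
#ifndef PIDFD_THREAD
#define PIDFD_THREAD O_EXCL
#endif

/* Hypothetical handle layout, for illustration only. */
struct model_fid {
	uint64_t ino;   /* inode number used to detect pid recycling */
	int32_t pid;    /* pid number stored in the handle */
	uint32_t flags; /* 0 (thread-group leader) or PIDFD_THREAD */
};

/* Option A: the encoding side records which flavour the pidfd had. */
static struct model_fid model_encode(uint64_t ino, int32_t pid,
				     unsigned int pidfd_flags)
{
	struct model_fid fid = {
		.ino = ino,
		.pid = pid,
		.flags = pidfd_flags & PIDFD_THREAD,
	};
	return fid;
}

/*
 * Decide the flavour of the pidfd handed back on decode.
 * caller_chooses == 0: take it from the handle (option A).
 * caller_chooses != 0: take it from the caller's open flags (option B).
 */
static unsigned int model_decode_flags(const struct model_fid *fid,
				       unsigned int open_flags,
				       int caller_chooses)
{
	assert(fid != NULL);
	if (caller_chooses)
		return (open_flags & O_EXCL) ? PIDFD_THREAD : 0;
	return fid->flags;
}
```

With option A a handle made from a thread-group leader pidfd always
decodes to a thread-group leader pidfd. With option B the same handle can
decode to either flavour depending on what the caller asks for.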
>
> >> +static struct dentry *pidfs_fh_to_dentry(struct super_block *sb,
> >> + struct fid *gen_fid,
> >> + int fh_len, int fh_type)
> >> +{
> >> + int ret;
> >> + struct path path;
> >> + struct pidfd_fid *fid = (struct pidfd_fid *)gen_fid;
> >> + struct pid *pid;
> >> +
> >> + if (fh_type != FILEID_INO64_GEN || fh_len < PIDFD_FID_LEN)
> >> + return NULL;
> >> +
> >> + pid = find_get_pid_ns(fid->pid, &init_pid_ns);
> >> + if (!pid || pid->ino != fid->ino || pid_vnr(pid) == 0) {
> >> + put_pid(pid);
> >> + return NULL;
> >> + }
> > I think we can avoid the premature reference bump and do:
> >
> > scoped_guard(rcu) {
> > struct pid *pid;
> >
> > pid = find_pid_ns(fid->pid, &init_pid_ns);
> > if (!pid)
> > return NULL;
> >
> > /* Did the pid get recycled? */
> > if (pid->ino != fid->ino)
> > return NULL;
> >
> > /* Must be resolvable in the caller's pid namespace. */
> > if (pid_vnr(pid) == 0)
> > return NULL;
> >
> > /* Ok, this is the pid we want. */
> > get_pid(pid);
> > }
>
> I can go with that if preferred. I was worried a bit about making the RCU
> critical section too large, but of course I'm sure there are much larger
> sections inside the kernel.
This is perfectly fine. Don't worry about it.
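
The ordering of the checks is the point: look up, validate, and only then
take the reference. A single-threaded userspace model of that ordering
follows. It has no real RCU or locking, and struct model_pid,
model_find() and model_lookup() are hypothetical stand-ins, not kernel
APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of struct pid used here. */
struct model_pid {
	int nr;       /* pid number in the initial pid namespace */
	uint64_t ino; /* inode number, compared to detect recycling */
	int vnr;      /* number seen by the caller; 0 means not visible */
	int refcount; /* models the reference count */
};

/* Plain lookup that does not touch the reference count. */
static struct model_pid *model_find(struct model_pid *table, size_t n, int nr)
{
	for (size_t i = 0; i < n; i++)
		if (table[i].nr == nr)
			return &table[i];
	return NULL;
}

/* Validate first; bump the reference count only on success. */
static struct model_pid *model_lookup(struct model_pid *table, size_t n,
				      int nr, uint64_t ino)
{
	struct model_pid *pid;

	assert(table != NULL || n == 0);

	pid = model_find(table, n, nr);
	if (!pid)
		return NULL;

	/* Did the pid number get recycled? */
	if (pid->ino != ino)
		return NULL;

	/* Must be resolvable in the caller's pid namespace. */
	if (pid->vnr == 0)
		return NULL;

	/* Ok, this is the pid we want. */
	pid->refcount++;
	return pid;
}
```

Every failure path returns without having touched the reference count,
so there is nothing to drop on error.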
>
> >> +
> >> + ret = path_from_stashed(&pid->stashed, pidfs_mnt, pid, &path);
> >> + if (ret < 0)
> >> + return ERR_PTR(ret);
> >> +
> >> + mntput(path.mnt);
> >> + return path.dentry;
> >> }
>
> Similarly here I should probably refactor this into dentry_from_stashed in
> order to avoid a needless bump-then-drop of path.mnt's reference count.
No, what you have now is fine. I wouldn't add a specific helper for
this. In contrast to the pid the pidfs mount never goes away.