lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 7 Dec 2023 10:52:38 -0700
From:   Tycho Andersen <tycho@...ho.pizza>
To:     Christian Brauner <brauner@...nel.org>
Cc:     Oleg Nesterov <oleg@...hat.com>,
        "Eric W . Biederman" <ebiederm@...ssion.com>,
        linux-kernel@...r.kernel.org, linux-api@...r.kernel.org,
        Tycho Andersen <tandersen@...flix.com>,
        Jan Kara <jack@...e.cz>, linux-fsdevel@...r.kernel.org,
        Joel Fernandes <joel@...lfernandes.org>
Subject: Re: [RFC 1/3] pidfd: allow pidfd_open() on non-thread-group leaders

On Thu, Dec 07, 2023 at 06:21:18PM +0100, Christian Brauner wrote:
> [Cc fsdevel & Jan because we had some discussions about fanotify
> returning non-thread-group pidfds. That's just for awareness or in case
> this might need special handling.]
> 
> On Thu, Nov 30, 2023 at 09:39:44AM -0700, Tycho Andersen wrote:
> > From: Tycho Andersen <tandersen@...flix.com>
> > 
> > We are using the pidfd family of syscalls with the seccomp userspace
> > notifier. When some thread triggers a seccomp notification, we want to do
> > some things to its context (munge fd tables via pidfd_getfd(), maybe write
> > to its memory, etc.). However, threads created with ~CLONE_FILES or
> > ~CLONE_VM mean that we can't use the pidfd family of syscalls for this
> > purpose, since their fd table or mm are distinct from the thread group
> > leader's. In this patch, we relax this restriction for pidfd_open().
> > 
> > In order to avoid dangling poll() users we need to notify pidfd waiters
> > when individual threads die, but once we do that all the other machinery
> > seems to work ok viz. the tests. But I suppose there are more cases than
> > just this one.
> > 
> > Another weirdness is the open-coding of this vs. exporting using
> > do_notify_pidfd(). This particular location is after __exit_signal() is
> > called, which does __unhash_process() which kills ->thread_pid, so we need
> > to use the copy we have locally, vs do_notify_pid() which accesses it via
> > task_pid(). Maybe this suggests that the notification should live somewhere
> > in __exit_signals()? I just put it here because I saw we were already
> > testing if this task was the leader.
> > 
> > Signed-off-by: Tycho Andersen <tandersen@...flix.com>
> > ---
> 
> So we've always said that if there's a use-case for this then we're
> willing to support it. And I think that stance hasn't changed. I know
> that others have expressed interest in this as well.
> 
> So currently the series only enables pidfds for threads to be created
> and allows notifications for threads. But all places that currently make
> use of pidfds refuse non-thread-group leaders. We can certainly proceed
> with a patch series that only enables creation and exit notification but
> we should also consider unlocking additional functionality:
> 
> * audit of all callers that use pidfd_get_task()
> 
>   (1) process_madvise()
>   (2) process_mrlease()
> 
>   I expect that both can handle threads just fine but we'd need an Ack
>   from mm people.
> 
> * pidfd_prepare() is used to create pidfds for:
> 
>   (1) CLONE_PIDFD via clone() and clone3()
>   (2) SCM_PIDFD and SO_PEERPIDFD
>   (3) fanotify
>   
>   (1) is what this series here is about.
> 
>   For (2) we need to check whether fanotify would be ok to handle pidfds
>   for threads. It might be fine but Jan will probably know more.
> 
>   For (3) the change doesn't matter because SCM_CREDS always use the
>   thread-group leader. So even if we allowed the creation of pidfds for
>   threads it wouldn't matter.
> * audit all callers of pidfd_pid() whether they could simply be switched
>   to handle individual threads:
> 
>   (1) setns() handles threads just fine so this is safe to allow.
>   (2) pidfd_getfd() I would like to keep restricted and essentially
>       freeze new features for it.
> 
>       I'm not happy that we did didn't just implement it as an ioctl to
>       the seccomp notifier. And I wouldn't oppose a patch that would add
>       that functionality to the seccomp notifier itself. But that's a
>       separate topic.
>   (3) pidfd_send_signal(). I think that one is the most interesting on
>       to allow signaling individual threads. I'm not sure that you need
>       to do this right now in this patch but we need to think about what
>       we want to do there.

This all sounds reasonable to me, I can take a look as time permits.

pidfd_send_signal() at the very least would have been useful while
writing these tests.

Tycho

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ