Message-ID: <CAG48ez1vd4Yhd3DqHVjTWM-N0MaNnX9n8MNV7MEyU5m3XDu+kQ@mail.gmail.com>
Date: Wed, 24 Jul 2019 21:07:54 +0200
From: Jann Horn <jannh@...gle.com>
To: Christian Brauner <christian@...uner.io>
Cc: kernel list <linux-kernel@...r.kernel.org>,
Oleg Nesterov <oleg@...hat.com>, Arnd Bergmann <arnd@...db.de>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Kees Cook <keescook@...omium.org>,
"Joel Fernandes (Google)" <joel@...lfernandes.org>,
Thomas Gleixner <tglx@...utronix.de>,
Tejun Heo <tj@...nel.org>, David Howells <dhowells@...hat.com>,
Andy Lutomirski <luto@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Aleksa Sarai <cyphar@...har.com>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Al Viro <viro@...iv.linux.org.uk>,
kernel-team <kernel-team@...roid.com>,
Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Linux API <linux-api@...r.kernel.org>
Subject: Re: [PATCH 4/5] pidfd: add CLONE_WAIT_PID
On Wed, Jul 24, 2019 at 8:27 PM Christian Brauner <christian@...uner.io> wrote:
> On July 24, 2019 8:14:26 PM GMT+02:00, Jann Horn <jannh@...gle.com> wrote:
> >On Wed, Jul 24, 2019 at 4:48 PM Christian Brauner
> ><christian@...uner.io> wrote:
> >> If CLONE_WAIT_PID is set the newly created process will not be
> >> considered by process wait requests that wait generically on children
> >> such as:
> >>
> >> syscall(__NR_wait4, -1, wstatus, options, rusage)
> >> syscall(__NR_waitpid, -1, wstatus, options)
> >> syscall(__NR_waitid, P_ALL, -1, siginfo, options, rusage)
> >> syscall(__NR_waitid, P_PGID, -1, siginfo, options, rusage)
> >> syscall(__NR_waitpid, -pid, wstatus, options)
> >> syscall(__NR_wait4, -pid, wstatus, options, rusage)
> >>
> >> A process created with CLONE_WAIT_PID can only be waited upon with a
> >> focussed wait call. This ensures that processes can be reaped even if
> >> all file descriptors referring to it are closed.
> >[...]
> >> diff --git a/kernel/fork.c b/kernel/fork.c
> >> index baaff6570517..a067f3876e2e 100644
> >> --- a/kernel/fork.c
> >> +++ b/kernel/fork.c
> >> @@ -1910,6 +1910,8 @@ static __latent_entropy struct task_struct *copy_process(
> >> delayacct_tsk_init(p); /* Must remain after dup_task_struct() */
> >> p->flags &= ~(PF_SUPERPRIV | PF_WQ_WORKER | PF_IDLE);
> >> p->flags |= PF_FORKNOEXEC;
> >> + if (clone_flags & CLONE_WAIT_PID)
> >> + p->flags |= PF_WAIT_PID;
> >> INIT_LIST_HEAD(&p->children);
> >> INIT_LIST_HEAD(&p->sibling);
> >> rcu_copy_process(p);
> >
> >This means that if a process with PF_WAIT_PID forks, the child
> >inherits the flag, right? That seems unintended? You might have to add
> >something like "if ((clone_flags & CLONE_THREAD) == 0) p->flags &=
> >~PF_WAIT_PID;" before this. (I think threads do have to inherit the
> >flag so that the case where a non-leader thread of the child goes
> >through execve and steals the leader's identity is handled properly.)
> >Or you could cram it somewhere into signal_struct instead of on the
> >task - that might be a more logical place for it?
>
> Hm, CLONE_WAIT_PID is only useable with CLONE_PIDFD which in turn is
> not useable with CLONE_THREAD.
> But we should probably make that explicit for CLONE_WAIT_PID too.
To clarify:
This code looks buggy to me because p->flags is inherited from the
parent, with the exception of flags that are explicitly stripped out.
Since PF_WAIT_PID is not stripped out, this means that if task A
creates a child B with clone(CLONE_WAIT_PID), and then task B uses
fork() to create a child C, then B will not be able to use
wait(&status) to wait for C since C inherited PF_WAIT_PID from B.
The obvious way to fix that would be to always strip out PF_WAIT_PID;
but that would also be wrong, because if task B creates a thread C,
and then C calls execve(), the task_struct of B goes away and B's TGID
is taken over by C. When C eventually exits, it should still obey the
CLONE_WAIT_PID (since to A, it's all the same process). Therefore, if
p->flags is used to track whether the task was created with
CLONE_WAIT_PID, PF_WAIT_PID must be inherited if CLONE_THREAD is set.
So:
diff --git a/kernel/fork.c b/kernel/fork.c
index d8ae0f1b4148..b32e1e9a6c9c 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1902,6 +1902,10 @@ static __latent_entropy struct task_struct *copy_process(
delayacct_tsk_init(p); /* Must remain after dup_task_struct() */
p->flags &= ~(PF_SUPERPRIV | PF_WQ_WORKER | PF_IDLE);
p->flags |= PF_FORKNOEXEC;
+ if (!(clone_flags & CLONE_THREAD))
+ p->flags &= ~PF_WAIT_PID;
+ if (clone_flags & CLONE_WAIT_PID)
+ p->flags |= PF_WAIT_PID;
INIT_LIST_HEAD(&p->children);
INIT_LIST_HEAD(&p->sibling);
rcu_copy_process(p);
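
The three cases discussed above (clone with CLONE_WAIT_PID, a plain fork() from a flagged task, and thread creation by a flagged task) can be checked with a small userspace model. This is only a sketch under stated assumptions: the MODEL_* values, struct model_task, model_copy() and the other helpers are made up for illustration and are not kernel code. Only the two if-statements in model_copy() are meant to mirror the hunk above.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative values only; these are not the kernel's constants. */
#define MODEL_CLONE_THREAD   0x1u
#define MODEL_CLONE_WAIT_PID 0x2u
#define MODEL_PF_WAIT_PID    0x1u

/* Hypothetical stand-in for the one task_struct field that matters here. */
struct model_task {
	unsigned int flags;
};

/* Models flag handling at task creation: the child starts out with a
 * copy of the parent's flags, then the two checks adjust the copy. */
struct model_task model_copy(const struct model_task *parent,
			     unsigned int clone_flags)
{
	struct model_task child = *parent;

	/* A new process must not inherit the flag; a new thread must. */
	if (!(clone_flags & MODEL_CLONE_THREAD))
		child.flags &= ~MODEL_PF_WAIT_PID;
	/* The flag is set only when the creator asks for it. */
	if (clone_flags & MODEL_CLONE_WAIT_PID)
		child.flags |= MODEL_PF_WAIT_PID;
	return child;
}

/* True if a generic wait (e.g. wait(&status)) would skip this task. */
bool model_hidden_from_generic_wait(const struct model_task *t)
{
	return (t->flags & MODEL_PF_WAIT_PID) != 0;
}

/* Walks through the A -> B -> C scenarios from the text. */
void model_flags_check(void)
{
	struct model_task a = { 0 };
	struct model_task b = model_copy(&a, MODEL_CLONE_WAIT_PID);
	struct model_task c_fork = model_copy(&b, 0);
	struct model_task c_thread = model_copy(&b, MODEL_CLONE_THREAD);

	assert(model_hidden_from_generic_wait(&b));        /* A asked for it */
	assert(!model_hidden_from_generic_wait(&c_fork));  /* B can wait() for C */
	assert(model_hidden_from_generic_wait(&c_thread)); /* survives execve takeover */
}
```

In this model, dropping the first check reproduces the inheritance problem (c_fork stays hidden from B's generic wait), while making the strip unconditional loses the flag on c_thread.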
An alternative would be to not use p->flags at all, but instead make
this a property of the signal_struct - since the property is shared by
all threads, that might make more sense?
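
A rough userspace sketch of that alternative follows, assuming the property is stored once per process in a structure that all threads share. Everything here is hypothetical: struct sig_model, its wait_pid field, and sig_model_copy() are illustrative names, not existing kernel members or functions. The model relies only on the fact that threads share their process-wide signal state while a new process gets a fresh one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative values only; these are not the kernel's constants. */
#define MODEL_CLONE_THREAD   0x1u
#define MODEL_CLONE_WAIT_PID 0x2u

/* Hypothetical per-process state, shared by all threads of a process. */
struct sig_model {
	bool wait_pid;	/* hypothetical field name */
};

struct sig_model_task {
	struct sig_model *signal;
};

/*
 * Models the relevant part of task creation. new_sig is caller-provided
 * storage; it is used only when a new process (not a thread) is created.
 */
void sig_model_copy(struct sig_model_task *child,
		    const struct sig_model_task *parent,
		    unsigned int clone_flags,
		    struct sig_model *new_sig)
{
	if (clone_flags & MODEL_CLONE_THREAD) {
		/* Threads share the parent's state, so nothing is copied. */
		child->signal = parent->signal;
		return;
	}
	/* A new process starts fresh; the property comes only from the flags. */
	new_sig->wait_pid = (clone_flags & MODEL_CLONE_WAIT_PID) != 0;
	child->signal = new_sig;
}

/* Same A -> B -> C scenarios as in the text, without per-task flags. */
void sig_model_check(void)
{
	struct sig_model sig_a = { false }, sig_b, sig_c;
	struct sig_model_task a = { &sig_a }, b, b_thread, c;

	sig_model_copy(&b, &a, MODEL_CLONE_WAIT_PID, &sig_b);
	sig_model_copy(&b_thread, &b, MODEL_CLONE_THREAD, NULL);
	sig_model_copy(&c, &b, 0, &sig_c);

	assert(b.signal->wait_pid);          /* set for the whole process */
	assert(b_thread.signal == b.signal); /* threads see the same value */
	assert(!c.signal->wait_pid);         /* a forked child starts clean */
}
```

With this layout there is no inherit-or-strip decision to get wrong: a thread that takes over the leader's identity after execve() still points at the same shared structure.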