Message-ID: <CAGLj2rELekFugPJ8OGH8wxFabmZ1UJBORwywpZrStKjohTF63A@mail.gmail.com>
Date: Sun, 31 Mar 2019 16:33:48 +0100
From: Jonathan Kowalski <bl0pbl33p@...il.com>
To: Linus Torvalds <torvalds@...ux-foundation.org>
Cc: Jann Horn <jannh@...gle.com>,
Joel Fernandes <joel@...lfernandes.org>,
Daniel Colascione <dancol@...gle.com>,
Christian Brauner <christian@...uner.io>,
Andrew Lutomirski <luto@...nel.org>,
David Howells <dhowells@...hat.com>,
"Serge E. Hallyn" <serge@...lyn.com>,
Linux API <linux-api@...r.kernel.org>,
Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
Arnd Bergmann <arnd@...db.de>,
"Eric W. Biederman" <ebiederm@...ssion.com>,
Konstantin Khlebnikov <khlebnikov@...dex-team.ru>,
Kees Cook <keescook@...omium.org>,
Alexey Dobriyan <adobriyan@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Michael Kerrisk-manpages <mtk.manpages@...il.com>,
"Dmitry V. Levin" <ldv@...linux.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Oleg Nesterov <oleg@...hat.com>,
Nagarathnam Muthusamy <nagarathnam.muthusamy@...cle.com>,
Aleksa Sarai <cyphar@...har.com>,
Al Viro <viro@...iv.linux.org.uk>
Subject: Re: [PATCH v2 0/5] pid: add pidfd_open()
On Sun, Mar 31, 2019 at 3:59 PM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Sat, Mar 30, 2019 at 9:47 PM Jann Horn <jannh@...gle.com> wrote:
> >
> > Sure, given a pidfd_clone() syscall, as long as the parent of the
> > process is giving you a pidfd for it and you don't have to deal with
> > grandchildren created by fork() calls outside your control, that
> > works.
>
> Don't do pidfd_clone() and pidfd_wait().
>
> Both of those existing system calls already get a "flags" argument.
> Just make a WPIDFD (for waitid) and CLONE_PIDFD (for clone) bit, and
> make the existing system calls just take/return a pidfd.

clone is out of flags, so there will have to be a new system call.

I am not sure about the waitid bit. Are you suggesting that it takes a
pidfd and waits on it? I was wondering whether we could make the pidfd
itself pollable, and readable for the exit status. At pidfd_open time
you pass a flag: only a parent gets a readable instance, everyone else
gets a pollable one (e.g. a reaper watching an indirect child), and it
fails for threads.

The pidfd that clone2 returns could then also be polled and read from.

The main pain point is that currently, when I ptrace a process from a
thread, I need to use waitpid (waitid throws away ptrace-critical
information), and since ptrace works on a thread-by-thread basis, only
the attaching thread can do the waitpid. This means I cannot do
anything else from that thread concurrently. waitfd was supposed to
solve this (back in 2009) but it never made it in, and clone4 from
Josh Triplett did something similar (it returned the exit status over
the clonefd).
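
As an example of the detail carried in the waitpid status word,
ptrace(2) documents that a PTRACE_EVENT stop is reported as
status>>8 == (SIGTRAP | (PTRACE_EVENT_foo<<8)). A rough sketch of
decoding that follows. The event numbers are copied from
<linux/ptrace.h>, and decode_ptrace_stop is a hypothetical helper, not
part of any library.

```python
import os
import signal

# Event numbers as defined in <linux/ptrace.h>; copied here because
# Python's os module does not export them.
PTRACE_EVENTS = {
    1: "FORK", 2: "VFORK", 3: "CLONE", 4: "EXEC",
    5: "VFORK_DONE", 6: "EXIT", 7: "SECCOMP",
}


def decode_ptrace_stop(status):
    """Return (stop_signal, event_name) for a stopped tracee, else None.

    event_name is None for an ordinary signal-delivery stop.
    """
    if not os.WIFSTOPPED(status):
        return None
    sig = os.WSTOPSIG(status)      # bits 8..15 of the status word
    event = status >> 16           # PTRACE_EVENT_* number, 0 if none
    if sig == signal.SIGTRAP and event:
        return sig, PTRACE_EVENTS.get(event, "UNKNOWN")
    return sig, None
```

A tracer loop would call this on the status that waitpid returns for
the tracee, from the thread that attached.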

FreeBSD's process descriptors, which originally inspired all this
work, are also pollable, and it would help with adoption if the
semantics were similar. Besides that, it would let libraries host
their own set of children without affecting the entire process's
waiting logic or mucking with the SIGCHLD handler (you wouldn't need
signals).

>
> Side note: we could (should?) also make the default maxpid just be
> larger. It needs to fit in an 'int', but MAXINT instead of 65535 would
> likely already make a lot of these attacks harder.
>
> There was some really old legacy reason why we actually limited it to
> 65535 originally. It was old and crufty even back when..
>
> Linus