Message-ID: <CAEXW_YRFUNqmaeVB9u2YbyLXHZhskYiP6_Ae1vqXqu+swtipVA@mail.gmail.com>
Date: Fri, 15 Mar 2019 18:22:38 -0700
From: Joel Fernandes <joel@...lfernandes.org>
To: Christian Brauner <christian@...uner.io>
Cc: Linus Torvalds <torvalds@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
LKML <linux-kernel@...r.kernel.org>,
"maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
Arnd Bergmann <arnd@...db.de>
Subject: Re: [GIT PULL RESEND] pidfd changes for v5.1-rc1
On Tue, Mar 12, 2019 at 6:53 AM Christian Brauner <christian@...uner.io> wrote:
>
> Hi Linus,
>
> This is a resend of the pull request for the pidfd_send_signal() syscall
> which I sent last Tuesday. I'm not sure whether you just wanted to take a
> closer look.
>
> The following changes since commit f17b5f06cb92ef2250513a1e154c47b78df07d40:
>
> Linux 5.0-rc4 (2019-01-27 15:18:05 -0800)
>
> are available in the Git repository at:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git tags/pidfd-v5.1-rc1
>
> The patchset introduces the ability to use file descriptors from /proc/<pid>
> as stable handles on struct pid. Even if a pid is recycled the handle will
> not change. For a start these fds can be used to send signals to the
> processes they refer to.
Joel from the Android team here. This will solve a long-standing issue we
have with Android's low memory killer daemon (lmkd), where killing a PID
is racy with the traditional signal delivery methods. With this new API,
we can kill things correctly in a race-free way. I hope this gets merged
soon, and I look forward to developing further on top of it (such as
support for knowing when something was killed and waiting for it reliably;
right now we have a very suboptimal 100ms periodic polling loop to
check for process death, which slows down how fast we can kill processes to
reclaim their memory).
thanks,
- Joel
>
> With the ability to use /proc/<pid> fds as stable handles on struct pid we
> can fix a long-standing issue where after a process has exited its pid can
> be reused by another process. If a caller sends a signal to a reused pid it
> will end up signaling the wrong process.
> With this patchset we enable a variety of use cases. One obvious example is
> that we can now safely delegate an important part of process management -
> sending signals - to processes other than the parent of a given process by
> sending file descriptors around via scm rights and not fearing that the
> given process will have been recycled in the meantime.
> It also makes it easy to test whether a given process is still alive by
> sending signal 0 to a pidfd, which is quite handy.
> There has been some interest in this feature e.g. from systems management
> (systemd, glibc) and container managers. I have requested and gotten
> comments from glibc to make sure that this syscall is suitable for their
> needs as well. In the future I expect it to take over the functionality of
> most other pid-based signal syscalls, but such features are left until they
> are needed.
>
> The patchset has been sitting in linux-next for quite a while and has
> not caused any issues. It comes with selftests which verify basic
> functionality and also test that a recycled pid cannot be signaled via a
> pidfd.
>
> Jon has written about a prior version of this patchset. It should cover the
> basic functionality since not a lot has changed since then:
>
> https://lwn.net/Articles/773459/
>
> The commit message for the syscall itself extensively documents the
> syscall, including its functionality and extensibility.
>
> /* Merge conflict and syscall number coordination */
> Please note, there will be a merge conflict between Jens' io_uring
> patch set in the block tree and this tree. To minimize its impact Arnd
> worked with Jens and me to coordinate syscall numbers in advance.
> pidfd_send_signal() takes 424 and Jens' patchset took 425 to 427.
>
> /* Separate tree on kernel.org */
> At the beginning of last merge cycle it was suggested to move this patchset
> into a separate tree on kernel.org as there will be more work coming that
> will be extending the use of file descriptors for processes. The tree was
> announced in January:
>
> https://lore.kernel.org/lkml/20190108234722.bojj5bqowlutymnt@brauner.io/
>
> The pidfd tree is located on kernel.org
>
> https://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux.git/
>
> and its for-next branch has been tracked by Stephen in linux-next since
> the beginning of the 5.0 development cycle. I'm prepared to deal with any
> fallout coming from this work going forward.
>
> The only thing that has changed recently in these patches was the addition
> of two more Acked-by/Reviewed-by from David Howells and tglx after the
> last round of reviews.
>
> Please consider pulling these changes from the signed pidfd-v5.1-rc1 tag.
>
> Thanks!
> Christian
>
> ----------------------------------------------------------------
> pidfd patches for v5.1-rc1
>
> ----------------------------------------------------------------
> Christian Brauner (2):
> signal: add pidfd_send_signal() syscall
> selftests: add tests for pidfd_send_signal()
>
> arch/x86/entry/syscalls/syscall_32.tbl | 1 +
> arch/x86/entry/syscalls/syscall_64.tbl | 1 +
> fs/proc/base.c | 9 +
> include/linux/proc_fs.h | 6 +
> include/linux/syscalls.h | 3 +
> include/uapi/asm-generic/unistd.h | 4 +-
> kernel/signal.c | 133 +++++++++-
> kernel/sys_ni.c | 1 +
> tools/testing/selftests/Makefile | 1 +
> tools/testing/selftests/pidfd/Makefile | 6 +
> tools/testing/selftests/pidfd/pidfd_test.c | 381 +++++++++++++++++++++++++++++
> 11 files changed, 539 insertions(+), 7 deletions(-)
> create mode 100644 tools/testing/selftests/pidfd/Makefile
> create mode 100644 tools/testing/selftests/pidfd/pidfd_test.c