linux-kernel - Re: [RFC PATCH v2] Minimal non-child process exit notification support

Message-ID: <20181101070036.l24c2p432ohuwmqf@yavin>
Date:   Thu, 1 Nov 2018 18:00:36 +1100
From:   Aleksa Sarai <cyphar@...har.com>
To:     Daniel Colascione <dancol@...gle.com>
Cc:     linux-kernel@...r.kernel.org, timmurray@...gle.com,
        joelaf@...gle.com
Subject: Re: [RFC PATCH v2] Minimal non-child process exit notification
 support

On 2018-10-29, Daniel Colascione <dancol@...gle.com> wrote:
> This patch adds a new file under /proc/pid, /proc/pid/exithand.
> Attempting to read from an exithand file will block until the
> corresponding process exits, at which point the read will successfully
> complete with EOF.  The file descriptor supports both blocking
> operations and poll(2). It's intended to be a minimal interface for
> allowing a program to wait for the exit of a process that is not one
> of its children.
> 
> Why might we want this interface? Android's lmkd kills processes in
> order to free memory in response to various memory pressure
> signals. It's desirable to wait until a killed process actually exits
> before moving on (if needed) to killing the next process. Since the
> processes that lmkd kills are not lmkd's children, lmkd currently
> lacks a way to wait for a process to actually die after being sent
> SIGKILL; today, lmkd resorts to polling the proc filesystem pid
> entry. This interface allows lmkd to give up polling and instead block
> and wait for process death.
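
For reference, a consumer of the proposed file could look roughly like
the sketch below. It assumes only the semantics described above (the fd
is pollable and a read completes with EOF once the process has exited).
The helper name wait_for_eof() is made up for illustration and is not
part of the patch.

```c
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Block until fd reports end-of-file. With the RFC patch applied, fd
 * would be an open /proc/<pid>/exithand file, and EOF would mean that
 * the process has exited.
 * Returns 0 on EOF, -1 on error (errno is set).
 */
static int wait_for_eof(int fd)
{
	struct pollfd pfd = { .fd = fd, .events = POLLIN };
	char buf[64];

	for (;;) {
		/* Sleep until the fd becomes readable or hangs up. */
		if (poll(&pfd, 1, -1) < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}

		ssize_t n = read(fd, buf, sizeof(buf));
		if (n == 0)
			return 0;	/* EOF: nothing more will arrive */
		if (n < 0 && errno != EINTR && errno != EAGAIN)
			return -1;
		/* Any data that was read is discarded; keep waiting. */
	}
}
```

A caller would open the exithand path with O_RDONLY and pass the
resulting fd in; a timeout could be added through the third argument of
poll(2).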

I agree with the need for this interface (with a few caveats), but there
are a few points I'd like to make:

 * I don't think that making a new procfile is necessary. When you open
   /proc/$pid you already have a handle for the underlying process, and
   you can already poll to check whether the process has died (fstatat
   fails for instance). What if we just used an inotify event to tell
   userspace that the process has died -- to avoid userspace doing a
   poll loop?

 * There is a fairly old interface called the proc_connector which gives
   you global fork+exec+exit events (similar to kevents from FreeBSD
   though much less full-featured). I was working on some patches to
   extend proc_connector so that it could be used inside containers as
   well as by unprivileged users. This would be another way we could
   implement this.

I'm really not a huge fan of the "blocking read" semantic (though if we
have to have it, can we at least provide as much information as you get
from proc_connector -- such as the exit status?). Also maybe we should
integrate this into the exit machinery instead of this loop...

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>