linux-kernel - Re: [RFC PATCH] Minimal non-child process exit notification support

Message-ID: <CAKOZuetNC8Z4a5HAZkm2D4RMrTOCWj71rLJXOOtuoFaF3gGEpA@mail.gmail.com>
Date:   Tue, 30 Oct 2018 08:59:25 +0000
From:   Daniel Colascione <dancol@...gle.com>
To:     Joel Fernandes <joelaf@...gle.com>
Cc:     LKML <linux-kernel@...r.kernel.org>,
        Tim Murray <timmurray@...gle.com>
Subject: Re: [RFC PATCH] Minimal non-child process exit notification support

On Tue, Oct 30, 2018 at 3:06 AM, Joel Fernandes <joelaf@...gle.com> wrote:
> On Mon, Oct 29, 2018 at 1:01 PM Daniel Colascione <dancol@...gle.com> wrote:
>>
>> Thanks for taking a look.
>>
>> On Mon, Oct 29, 2018 at 7:45 PM, Joel Fernandes <joelaf@...gle.com> wrote:
>> >
>> > On Mon, Oct 29, 2018 at 10:53 AM Daniel Colascione <dancol@...gle.com> wrote:
>> > >
>> > > This patch adds a new file under /proc/pid, /proc/pid/exithand.
>> > > Attempting to read from an exithand file will block until the
>> > > corresponding process exits, at which point the read will successfully
>> > > complete with EOF.  The file descriptor supports both blocking
>> > > operations and poll(2). It's intended to be a minimal interface for
>> > > allowing a program to wait for the exit of a process that is not one
>> > > of its children.
>> > >
>> > > Why might we want this interface? Android's lmkd kills processes in
>> > > order to free memory in response to various memory pressure
>> > > signals. It's desirable to wait until a killed process actually exits
>> > > before moving on (if needed) to killing the next process. Since the
>> > > processes that lmkd kills are not lmkd's children, lmkd currently
>> > > lacks a way to wait for a process to actually die after being sent
>> > > SIGKILL; today, lmkd resorts to polling the proc filesystem pid
>> >
>> > Any idea why it needs to wait and then send SIGKILL? Why not do
>> > SIGKILL and look for errno == ESRCH in a loop with a delay.
>>
>> I want to get polling loops out of the system. Polling loops are bad
>> for wakeup attribution, bad for power, bad for priority inheritance,
>> and bad for latency. There's no right answer to the question "How long
>> should I wait before checking $CONDITION again?". If we can have an
>> explicit waitqueue interface to something, we should. Besides, PID
>> polling is vulnerable to PID reuse, whereas this mechanism (just like
>> anything based on struct pid) is immune to it.
>
> The argument sounds OK to me. I would also add more details to the commit
> message about the alternate methods to do this (such as kill polling
> or ptrace) and why they don't work well etc. so no one asks any
> questions. Like maybe under an "other ways to do this" section. A bit
> of googling also showed a netlink way of doing it without polling
> (though I didn't look into that much and wouldn't be surprised if it's
> more complicated)

Thanks for taking a look. I'll add to the commit message.

Re netlink: it isn't enabled everywhere and is subject to lossy buffer
overruns, AIUI. You could also monitor process exit by setting up
ftrace and watching events, or by installing BPF that watched for
process exit and sent a perf event. :-) All of these interfaces feel
like abusing a "monitoring" API for controlling system operations, and
this kind of abuse tends to have ugly failure modes. I'm looking for
something a bit more explicit and robust.
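
To make the contrast concrete, here is a rough userspace sketch of the two
styles, written in Python for brevity. It is an illustration only, not code
from the patch. The function names, the interval and the timeouts are all
made up for the example. The second function assumes only what the patch
description says: a read on the file descriptor completes with EOF once the
process has exited, and the descriptor works with poll(2). The
/proc/<pid>/exithand path exists only on a kernel carrying this RFC patch,
so any pollable fd that reaches EOF (a pipe, say) can stand in for it when
trying the code out.

```python
import os
import select
import time


def wait_by_kill_polling(pid, interval_s=0.05, timeout_s=5.0):
    """Polling style: probe the PID with signal 0 until kill() reports ESRCH.

    Returns True once the PID no longer exists, False on timeout.
    The interval is an arbitrary guess, and the PID could be reused
    by an unrelated process between two probes.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            os.kill(pid, 0)  # signal 0 sends nothing; it only checks the PID
        except ProcessLookupError:  # errno == ESRCH
            return True
        except PermissionError:  # PID exists but belongs to someone else
            pass
        time.sleep(interval_s)
    return False


def wait_for_eof(fd, timeout_ms=None):
    """Waitqueue style: sleep in poll() until fd reports an event, then read.

    Returns True once read() returns EOF, False if poll() times out.
    With the RFC patch applied, fd would hypothetically come from
    os.open(f"/proc/{pid}/exithand", os.O_RDONLY).
    """
    poller = select.poll()
    poller.register(fd, select.POLLIN)
    while True:
        if not poller.poll(timeout_ms):  # empty list means timeout
            return False
        if os.read(fd, 4096) == b"":  # empty read means EOF
            return True
```

The first function has to pick a sleep interval out of thin air and wakes up
repeatedly whether or not anything changed. The second sleeps until the
kernel has something to report, and because it holds an open file
descriptor rather than a bare PID number, it does not depend on the PID
staying unique.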

>
> Also I guess when you send a patch, it'd be good to pass
> "--cc-cmd='./scripts/get_maintainer.pl'" to git-send-email so it
> automatically CCs the maintainers who maintain this.

Thanks for the tip!