linux-kernel - Re: [RFC] simple_lmk: Introduce Simple Low Memory Killer for Android

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20190315124348.528ecd87@gandalf.local.home>
Date:   Fri, 15 Mar 2019 12:43:48 -0400
From:   Steven Rostedt <rostedt@...dmis.org>
To:     Daniel Colascione <dancol@...gle.com>
Cc:     Sultan Alsawaf <sultan@...neltoast.com>,
        Joel Fernandes <joel@...lfernandes.org>,
        Tim Murray <timmurray@...gle.com>,
        Michal Hocko <mhocko@...nel.org>,
        Suren Baghdasaryan <surenb@...gle.com>,
        Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
        Arve Hjønnevåg 
        <arve@...roid.com>, Todd Kjos <tkjos@...roid.com>,
        Martijn Coenen <maco@...roid.com>,
        Christian Brauner <christian@...uner.io>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        LKML <linux-kernel@...r.kernel.org>,
        "open list:ANDROID DRIVERS" <devel@...verdev.osuosl.org>,
        linux-mm <linux-mm@...ck.org>,
        kernel-team <kernel-team@...roid.com>
Subject: Re: [RFC] simple_lmk: Introduce Simple Low Memory Killer for
 Android

On Thu, 14 Mar 2019 21:36:43 -0700
Daniel Colascione <dancol@...gle.com> wrote:

> On Thu, Mar 14, 2019 at 8:16 PM Steven Rostedt <rostedt@...dmis.org> wrote:
> >
> > On Thu, 14 Mar 2019 13:49:11 -0700
> > Sultan Alsawaf <sultan@...neltoast.com> wrote:
> >  
> > > Perhaps I'm missing something, but if you want to know when a process has died
> > > after sending a SIGKILL to it, then why not just make the SIGKILL optionally
> > > block until the process has died completely? It'd be rather trivial to just
> > > store a pointer to an onstack completion inside the victim process' task_struct,
> > > and then complete it in free_task().  
> >
> > How would you implement such a method in userspace? kill() doesn't take
> > any parameters but the pid of the process you want to send a signal to,
> > and the signal to send. This would require a new system call, and be
> > quite a bit of work.  
> 
> That's what the pidfd work is for. Please read the original threads
> about the motivation and design of that facility.

I wasn't Cc'd on the original work, so I haven't read them.

> 
> > If you can solve this with an ebpf program, I
> > strongly suggest you do that instead.  
> 



> We do want killed processes to die promptly. That's why I support
> boosting a process's priority somehow when lmkd is about to kill it.
> The precise way in which we do that --- involving not only actual
> priority, but scheduler knobs, cgroup assignment, core affinity, and
> so on --- is a complex topic best left to userspace. lmkd already has
> all the knobs it needs to implement whatever priority boosting policy
> it wants.
> 
> Hell, once we add a pidfd_wait --- which I plan to work on, assuming
> nobody beats me to it, after pidfd_send_signal lands --- you can
> imagine a general-purpose priority inheritance mechanism expediting
> process death when a high-priority process waits on a pidfd_wait
> handle for a condemned process. You know you're on the right track
> design-wise when you start seeing this kind of elegant constructive
> interference between seemingly-unrelated features. What we don't need
> is some kind of blocking SIGKILL alternative or backdoor event
> delivery system.
> 
> We definitely don't want to have to wait for a process's parent to
> reap it. Instead, we want to wait for it to become a zombie. That's
> why I designed my original exithand patch to fire death notification
> upon transition to the zombie state, not upon process table removal,
> and I expect pidfd_wait (or whatever we call it) to act the same way.
> 
> In any case, there's a clear path forward here --- general-purpose,
> cheap, and elegant --- and we should just focus on doing that instead
> of more complex proposals with few advantages.

If you add new pidfd systemcalls then making a new way to send a signal
and block till it does die or whatever is more acceptable than adding a
new signal that changes the semantics of sending signals, which is what
I was against.

I do agree with Joel about bloating task_struct too. If anything, have
a wait queue you add, where you can allocate a descriptor with the task
dieing and task killing, and just search this queue on dying. We could
add a TIF flag to the task as well to let the exiting of this task know
it should do such an operation.

-- Steve