Message-ID: <1f8d300b-9a8b-de09-6d5d-6a9c20c66d24@sony.com>
Date: Wed, 21 Apr 2021 18:46:13 +0000
From: <Peter.Enderborg@...y.com>
To: <shakeelb@...gle.com>
CC: <hannes@...xchg.org>, <guro@...com>, <mhocko@...nel.org>,
<linux-mm@...ck.org>, <akpm@...ux-foundation.org>,
<cgroups@...r.kernel.org>, <rientjes@...gle.com>,
<linux-kernel@...r.kernel.org>, <surenb@...gle.com>,
<gthelen@...gle.com>, <dragoss@...gle.com>,
<padmapriyad@...gle.com>
Subject: Re: [RFC] memory reserve for userspace oom-killer
On 4/21/21 8:28 PM, Shakeel Butt wrote:
> On Wed, Apr 21, 2021 at 10:06 AM peter enderborg
> <peter.enderborg@...y.com> wrote:
>> On 4/20/21 3:44 AM, Shakeel Butt wrote:
> [...]
>> I think this is the wrong way to go.
> Which one? Are you talking about the kernel one? We already talked out
> of that. To decide to OOM, we need to look at a very diverse set of
> metrics and it seems like that would be very hard to do flexibly
> inside the kernel.
You don't need to decide to OOM, but when an OOM occurs you
can take proper action.
>
>> I sent a patch for android lowmemorykiller some years ago.
>>
>> http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2017-February/100319.html
>>
>> It has been improved since then, so it can handle oom callbacks, it can act on vmpressure and psi
>> and as a shrinker. The patches have not been ported to recent kernels though.
>>
>> I don't think vmpressure and psi are that relevant now. (They are what userspace acts on.) But the basic idea is to have a priority queue
>> within the kernel. It needs to pick up new processes and dying processes. And then it has an order, and that
>> is set with oom adj values by the activity manager in Android. I see this model can be reused for
>> something that is between a standard oom and userspace. Instead of vmpressure and psi
>> a watchdog might be a better way. If userspace (in Android the activity manager or lmkd) does not kick the watchdog,
>> the watchdog bites the task according to the priority and kills it. This priority list does not have to be a list generated
>> within the kernel. But it has the advantage that you inherit the parent's properties. We use an rb-tree for that.
>>
>> All that is missing is the watchdog.
>>
> Actually no. It is missing the flexibility to monitor metrics which a
> user cares about and based on which they decide to trigger oom-kill. Not sure
> how a watchdog will replace psi/vmpressure? Userspace petting the
> watchdog does not mean that the system is not suffering.
Userspace should very much do what it does. But when it
does not do what it should, including kicking the WD, then
the kernel kicks in and kills a predefined process, or as many
as needed, until the monitoring can start to kick again and regain
control.
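
A rough userspace model of that fallback, only to illustrate the idea.
This is a toy sketch in Python, not kernel code and not the actual
patches; the class name KillWatchdog, its methods and the injected
clock are all made up for the example. It assumes a higher oom adj
value means "kill first".

```python
import time


class KillWatchdog:
    """Toy model of the proposed watchdog (hypothetical, not kernel code)."""

    def __init__(self, timeout_s, kill, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.kill = kill            # callback that kills one pid
        self.clock = clock          # injectable so the model can be tested
        self.victims = {}           # pid -> oom adj (higher is killed first)
        self.last_kick = clock()

    def set_priority(self, pid, adj):
        # Called when a process appears or its oom adj changes.
        self.victims[pid] = adj

    def remove(self, pid):
        # Called when a process exits on its own.
        self.victims.pop(pid, None)

    def kick(self):
        # Userspace (e.g. lmkd) proves it is still alive and in control.
        self.last_kick = self.clock()

    def expired(self):
        return self.clock() - self.last_kick > self.timeout_s

    def poll(self):
        """Kill one task if userspace missed its deadline, else do nothing."""
        if not self.expired() or not self.victims:
            return None
        pid = max(self.victims, key=self.victims.get)
        del self.victims[pid]
        self.kill(pid)
        return pid
```

Calling poll() periodically kills one task per call for as long as the
watchdog stays un-kicked, which is the "as many as needed" part. As
soon as userspace kicks again, poll() goes back to doing nothing.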
>
> In addition oom priorities change dynamically and changing it in your
> system seems very hard. Cgroup awareness is missing too.
Why is that hard? Moving an object in an rb-tree is as good as it gets.
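
To make the point concrete, a priority change is just "remove the old
key, insert the new key" in an ordered structure. The sketch below is
a hypothetical Python illustration (PriorityOrder and its methods are
invented names). It uses a sorted list with bisect as a stand-in for
the rb-tree, so the lookup is O(log n) but the list shuffle is O(n); a
real rb-tree does both steps in O(log n).

```python
import bisect


class PriorityOrder:
    """Ordered (adj, pid) set; a sorted list stands in for the rb-tree."""

    def __init__(self):
        self.order = []   # sorted list of (adj, pid) tuples
        self.adj = {}     # pid -> current adj, to find the old key

    def set_adj(self, pid, adj):
        old = self.adj.get(pid)
        if old is not None:
            # Locate and drop the old key before re-inserting.
            i = bisect.bisect_left(self.order, (old, pid))
            del self.order[i]
        bisect.insort(self.order, (adj, pid))
        self.adj[pid] = adj

    def first_victim(self):
        # Highest adj sits at the end of the sorted list.
        return self.order[-1][1] if self.order else None
```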
>
> Anyways, there are already widely deployed userspace oom-killer
> solutions (lmkd, oomd). I am aiming to further improve the
> reliability.
Yes, and I totally agree that it is needed. But I don't think
it will be possible until Linux is realtime ready, including a
memory system that can guarantee allocation times.