Message-ID: <b3ff8456-fe0e-95a0-cccd-e94025a82560@bytedance.com>
Date: Mon, 26 Sep 2022 11:38:10 +0800
From: Gang Li <ligang.bdlg@...edance.com>
To: David Rientjes <rientjes@...gle.com>
Cc: Zefan Li <lizefan.x@...edance.com>, Tejun Heo <tj@...nel.org>,
Johannes Weiner <hannes@...xchg.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Michal Hocko <mhocko@...e.com>, cgroups@...r.kernel.org,
linux-kernel@...r.kernel.org, linux-mm@...ck.org
Subject: Re: Re: [RFC PATCH v1] mm: oom: introduce cpuset oom
On 2022/9/23 03:18, David Rientjes wrote:
> On Wed, 21 Sep 2022, Gang Li wrote:
>
>> cpuset confines processes to processor and memory node subsets.
>> When a process in a cpuset triggers oom, it may kill a completely
>> irrelevant process on another NUMA node, which will not release any
>> memory for this cpuset.
>>
>> It seems that `CONSTRAINT_CPUSET` is not really doing much these
>> days. Using CONSTRAINT_CPUSET, we can easily achieve node-aware oom
>> killing by selecting the victim from the cpuset that triggered the oom.
>>
>> Suggested-by: Michal Hocko <mhocko@...e.com>
>> Signed-off-by: Gang Li <ligang.bdlg@...edance.com>
>
> Hmm, is this the right approach?
>
> If a cpuset results in an oom condition, is there a reason why we'd need to
> find a process from within that cpuset to kill? I think the idea is to
> free memory on the oom set of nodes (cpuset.mems) and that can happen by
> killing a process that is not a member of this cpuset.
>
Hi,

My last patch implemented this idea[1][2], but it needs to increment or
decrement a per-mm_struct counter on every page allocation, release and
migration. As the UnixBench results show, this costs 0%-3% of performance
depending on the workload[2], so Michal Hocko suggested using cpuset
instead[3].
[1]. https://lore.kernel.org/all/20220512044634.63586-1-ligang.bdlg@bytedance.com/
[2]. https://lore.kernel.org/all/20220708082129.80115-1-ligang.bdlg@bytedance.com/
[3]. https://lore.kernel.org/all/YoJ%2FioXwGTdCywUE@dhcp22.suse.cz/
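
For context, a minimal sketch of the accounting that [1] needs is below.
The struct and helper names are illustrative only; the real series hooks
these updates into the page allocation, free and migration paths.

#include <linux/atomic.h>
#include <linux/mm.h>
#include <linux/numa.h>

struct mm_numa_stats {
	/* one page counter per NUMA node, charged to the owning mm */
	atomic_long_t numa_pages[MAX_NUMNODES];
};

/* call wherever a page is charged to an mm */
static inline void mm_numa_account_alloc(struct mm_numa_stats *stats,
					 struct page *page)
{
	atomic_long_inc(&stats->numa_pages[page_to_nid(page)]);
}

/* call wherever a page is uncharged from an mm */
static inline void mm_numa_account_free(struct mm_numa_stats *stats,
					 struct page *page)
{
	atomic_long_dec(&stats->numa_pages[page_to_nid(page)]);
}

The oom killer could then rank tasks by numa_pages[] on the oom nodes,
which is where the per-operation overhead comes from.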
> I understand the challenges of creating a NUMA aware oom killer to target
> memory that is actually resident on an oom node, but this approach doesn't
> seem right and could actually lead to pathological cases where a small
> process trying to fork in an otherwise empty cpuset is repeatedly oom
> killed when we'd actually prefer to kill a single large process.
>
I think there are three ways to achieve a NUMA-aware oom killer:
1. Count every page operation, which causes a performance loss[2].
2. Iterate over the pages of every process (like show_numa_map), which
   may stall the oom path.
3. Select the victim within a cpuset, which may lead to the pathological
   kills you describe (this patch; a simplified sketch follows).
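
To make option 3 concrete, here is a simplified sketch of the eligibility
check; the helper name is mine and the actual RFC scans the tasks attached
to the oom cpuset rather than filtering in the global task scan.

#include <linux/cpuset.h>
#include <linux/oom.h>
#include <linux/sched.h>

/*
 * Under CONSTRAINT_CPUSET, only tasks whose mems_allowed intersects
 * the triggering task's can free memory on the oom nodes, so skip
 * everything else.
 */
static bool cpuset_oom_task_eligible(struct oom_control *oc,
				     struct task_struct *tsk)
{
	if (oc->constraint != CONSTRAINT_CPUSET)
		return true;

	return cpuset_mems_allowed_intersects(current, tsk);
}

This is also what can produce the pathological case you describe: if the
oom cpuset only contains a small forking task, that task is the only
eligible victim even when a large process outside the cpuset holds most
of the memory on those nodes.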
None of these is perfect and I'm stuck. Do you have any ideas?