Message-ID: <ZNCsIm+RK0LStUA6@dhcp22.suse.cz>
Date: Mon, 7 Aug 2023 10:32:34 +0200
From: Michal Hocko <mhocko@...e.com>
To: Chuyi Zhou <zhouchuyi@...edance.com>
Cc: Alan Maguire <alan.maguire@...cle.com>, hannes@...xchg.org,
roman.gushchin@...ux.dev, ast@...nel.org, daniel@...earbox.net,
andrii@...nel.org, muchun.song@...ux.dev, bpf@...r.kernel.org,
linux-kernel@...r.kernel.org, wuyun.abel@...edance.com,
robin.lu@...edance.com
Subject: Re: [RFC PATCH 1/2] mm, oom: Introduce bpf_select_task
On Sat 05-08-23 07:55:56, Chuyi Zhou wrote:
> Hello,
>
> On 2023/8/4 19:34, Alan Maguire wrote:
[...]
> > I don't know anything about OOM mechanisms, so maybe it's just me, but I
> > found this confusing. Relying on the previous iteration to control
> > current iteration behaviour seems risky - even if BPF found a victim in
> > iteration N, it's no guarantee it will in iteration N+1.
> >
> The current kernel's OOM actually works like this:
>
> 1. if we first find a valid candidate victim A in iteration N, we would
> record it in oc->chosen.
>
> 2. In iteration N + 1, N+2..., we just compare oc->chosen with the current
> iterating task. Suppose we think current task B is better than
> oc->chosen(A), we would set oc->chosen = B and we would not consider A
> anymore.
>
> IIUC, most policy works like this. We just need to find the *most* suitable
> victim. Normally, if in current iteration we drop A and select B, we would
> not consider A anymore.
Yes, we iterate over all tasks in the specific oom domain (all tasks for
a global oom and all members of the memcg tree for a hard limit oom). The
in-tree oom policy has to iterate over all tasks to achieve some of its
goals (like preventing overkilling while the previously selected victim is
still on the way out). Also, oom_score_adj might change the final decision,
so you really have to check all eligible tasks.
I can imagine a BPF based policy could be less constrained and, as Roman
suggested, have pre-selected victims on standby. I do not see a problem
with having a break-like mode, similar to the current abort but without
canceling an already noted victim.
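
To make the difference between the full scan and a break-like mode concrete,
here is a minimal userspace sketch. It is not kernel code: struct task, enum
verdict, select_victim() and both policy functions are hypothetical names made
up for illustration, and "badness" is only a stand-in for whatever score a
policy computes. The assumption is that a policy callback may return a verdict
that ends the iteration while keeping the victim noted so far.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; none of these are kernel types. */
enum verdict { SKIP, CANDIDATE, SELECT_AND_STOP };

struct task {
    int id;
    long badness;     /* higher means a more suitable victim */
    int eligible;     /* 0 models a task that must not be killed */
    int preselected;  /* policy marked this task as a victim in advance */
};

typedef enum verdict (*policy_fn)(const struct task *cur,
                                  const struct task *chosen);

/* Full-scan policy: compare with the running best, never stop early. */
static enum verdict full_scan_policy(const struct task *cur,
                                     const struct task *chosen)
{
    if (!cur->eligible)
        return SKIP;
    if (chosen && cur->badness <= chosen->badness)
        return SKIP;
    return CANDIDATE;
}

/* Less constrained policy: a pre-selected victim ends the iteration. */
static enum verdict preselected_policy(const struct task *cur,
                                       const struct task *chosen)
{
    if (!cur->eligible)
        return SKIP;
    if (cur->preselected)
        return SELECT_AND_STOP;
    if (chosen && cur->badness <= chosen->badness)
        return SKIP;
    return CANDIDATE;
}

/*
 * Walk the tasks of one oom domain. The chosen victim is replaced whenever
 * the policy prefers the current task; SELECT_AND_STOP is the break-like
 * mode: stop iterating but keep the victim that was just noted.
 */
static const struct task *select_victim(const struct task *tasks, size_t n,
                                        policy_fn policy, size_t *visited)
{
    const struct task *chosen = NULL;
    size_t i;

    assert(tasks != NULL || n == 0);

    for (i = 0; i < n; i++) {
        enum verdict v = policy(&tasks[i], chosen);

        if (v == SKIP)
            continue;
        chosen = &tasks[i];
        if (v == SELECT_AND_STOP) {
            i++;    /* count the task that triggered the break */
            break;
        }
    }
    if (visited)
        *visited = i;
    return chosen;
}
```

With the full-scan policy every task is visited before a victim is settled,
which is what a policy needs when a later task can still change the decision.
With the pre-selected policy the walk ends at the first marked task.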
--
Michal Hocko
SUSE Labs