Message-ID: <43717dc3-b8c8-651a-3d61-019c9752a110@shopee.com>
Date:   Tue, 14 Mar 2023 21:27:56 +0800
From:   Haifeng Xu <haifeng.xu@...pee.com>
To:     Michal Hocko <mhocko@...e.com>
Cc:     shakeelb@...gle.com, hannes@...xchg.org, akpm@...ux-foundation.org,
        linux-mm@...ck.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH RESEND] mm/oom_kill: don't kill exiting tasks in
 oom_kill_memcg_member



On 2023/3/14 20:00, Michal Hocko wrote:
> On Tue 14-03-23 19:07:27, Haifeng Xu wrote:
>>
>>
>> On 2023/3/14 18:16, Michal Hocko wrote:
>>> On Tue 14-03-23 18:07:42, Haifeng Xu wrote:
>>>>
>>>>
>>>> On 2023/3/14 17:19, Michal Hocko wrote:
>>>>> On Tue 14-03-23 09:11:36, Haifeng Xu wrote:
>>>>>> If oom_group is set, oom_kill_process() invokes oom_kill_memcg_member()
>>>>>> to kill all processes in the memcg. When scanning tasks in memcg, maybe
>>>>>> the provided task is marked as oom victim. Also, some tasks are likely
>>>>>> to release their address space. There is no need to kill the exiting tasks.
>>>>>
>>>>> This doesn't state any actual problem. Could you be more specific? Is
>>>>> this a bug fix, a behavior change or an optimization?
>>>>
>>>>
>>>> 1) oom_kill_process() has invoked __oom_kill_process() to kill the selected victim, but the victim will be scanned
>>>> again in mem_cgroup_scan_tasks(). It's pointless to kill the victim twice.
>>>
>>> Why does that matter though? The purpose of task_will_free_mem in
>>> oom_kill_process is different. It would bail out from a potentially
>>> noisy OOM report when the selected oom victim is expected to terminate
>>> soon. __oom_kill_process called for the whole memcg doesn't aim at
>>> avoiding any oom victims. It merely sends a kill signal to all of them.
>>>
>>
>> Besides sending kill signals, __oom_kill_process() does some other work, such as printing messages, traversing
>> all user processes sharing the mm while holding an RCU read-side section, and so on. So if we skip the victim, we don't
>> need to do that work again, and it won't affect the original mechanism. All oom victims still get killed.
> 
> mm sharing among processes is a very rare thing but do not forget that
> task_will_free_mem needs to do the same thing for the same reason.

For the victim, __oom_kill_process() traverses all processes in the system, whether or not any other tasks share the mm.
If we skip the victim, this work can be dropped.

> 
>>>> 2) for those exiting processes, reaping them directly is also a faster way to free memory compared with invoking
>>>> __oom_kill_process().
>>>
>>> Is it? What if the terminating task is blocked on lock? Async oom
>>> reaping might release those resources in that case.
>>
>> Yes, the reaping process is asynchronous. I mean we don't need the work mentioned above any more.
>> By "reaping them directly" I mean adding the task to the oom reaper queue.
> 
> I do not follow.
> 
> In any case I still do not see any actual justification for the change
> other than "we can do it and it might turn out less expensive". This
> alone is not sufficient, just be explicit, because oom is hardly a fast
> path to optimize every single cpu cycle for. So unless you see an actual
> real life problem that would be behaving much better or even fixed then
> I am not convinced this is a worthwhile change to have.
> 

We can also see two identical messages ("Memory cgroup out of memory: Killed process ***") about the victim.
This is a little confusing. If we skip the victim, only one message is printed.
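
To illustrate the duplicate report, here is a minimal userspace sketch of the
flow described above. It is not kernel code: model_task, model_kill() and
model_group_kill() are hypothetical names, and model_kill() only counts how
many times a task would be reported, standing in for __oom_kill_process().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_task {
	int pid;
	bool is_victim;    /* set once the task has been picked as the OOM victim */
	int kill_reports;  /* number of "Killed process" reports for this task */
};

/* Stands in for __oom_kill_process(): each call produces one report. */
static void model_kill(struct model_task *t)
{
	t->kill_reports++;
}

/*
 * Stands in for the oom_group path: the selected victim is killed first,
 * then every task in the memcg is visited. Without skip_victim the victim
 * is visited, and therefore reported, a second time.
 */
static void model_group_kill(struct model_task *tasks, size_t n,
			     size_t victim, bool skip_victim)
{
	assert(victim < n);

	tasks[victim].is_victim = true;
	model_kill(&tasks[victim]);

	for (size_t i = 0; i < n; i++) {
		if (skip_victim && tasks[i].is_victim)
			continue;
		model_kill(&tasks[i]);
	}
}
```

With skip_victim false the victim ends up with two reports; with skip_victim
true every task in the group, including the victim, is reported exactly once.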
