Message-ID: <58caaa4f-cf78-4d0f-af31-8a9277b6ebf5@huaweicloud.com>
Date: Mon, 13 Jan 2025 14:51:55 +0800
From: Chen Ridong <chenridong@...weicloud.com>
To: Vlastimil Babka <vbabka@...e.cz>, akpm@...ux-foundation.org,
 mhocko@...nel.org, hannes@...xchg.org, yosryahmed@...gle.com,
 roman.gushchin@...ux.dev, shakeel.butt@...ux.dev, muchun.song@...ux.dev,
 davidf@...eo.com, handai.szj@...bao.com, rientjes@...gle.com,
 kamezawa.hiroyu@...fujitsu.com, RCU <rcu@...r.kernel.org>
Cc: linux-mm@...ck.org, linux-kernel@...r.kernel.org,
 cgroups@...r.kernel.org, chenridong@...wei.com, wangweiyang2@...wei.com
Subject: Re: [PATCH v3] memcg: fix soft lockup in the OOM process



On 2025/1/6 16:45, Vlastimil Babka wrote:
> On 12/24/24 03:52, Chen Ridong wrote:
>> From: Chen Ridong <chenridong@...wei.com>
> 
> +CC RCU
> 
>> A soft lockup issue was found in the product: about 56,000 tasks were in
>> the OOM cgroup, and the soft lockup was triggered while they were being
>> traversed.
>>
>> watchdog: BUG: soft lockup - CPU#2 stuck for 23s! [VM Thread:1503066]
>> CPU: 2 PID: 1503066 Comm: VM Thread Kdump: loaded Tainted: G
>> Hardware name: Huawei Cloud OpenStack Nova, BIOS
>> RIP: 0010:console_unlock+0x343/0x540
>> RSP: 0000:ffffb751447db9a0 EFLAGS: 00000247 ORIG_RAX: ffffffffffffff13
>> RAX: 0000000000000001 RBX: 0000000000000000 RCX: 00000000ffffffff
>> RDX: 0000000000000000 RSI: 0000000000000004 RDI: 0000000000000247
>> RBP: ffffffffafc71f90 R08: 0000000000000000 R09: 0000000000000040
>> R10: 0000000000000080 R11: 0000000000000000 R12: ffffffffafc74bd0
>> R13: ffffffffaf60a220 R14: 0000000000000247 R15: 0000000000000000
>> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007f2fe6ad91f0 CR3: 00000004b2076003 CR4: 0000000000360ee0
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> Call Trace:
>>  vprintk_emit+0x193/0x280
>>  printk+0x52/0x6e
>>  dump_task+0x114/0x130
>>  mem_cgroup_scan_tasks+0x76/0x100
>>  dump_header+0x1fe/0x210
>>  oom_kill_process+0xd1/0x100
>>  out_of_memory+0x125/0x570
>>  mem_cgroup_out_of_memory+0xb5/0xd0
>>  try_charge+0x720/0x770
>>  mem_cgroup_try_charge+0x86/0x180
>>  mem_cgroup_try_charge_delay+0x1c/0x40
>>  do_anonymous_page+0xb5/0x390
>>  handle_mm_fault+0xc4/0x1f0
>>
>> This is because thousands of processes are in the OOM cgroup and it takes
>> a long time to traverse them all, which leads to a soft lockup in the OOM
>> process.
>>
>> To fix this issue, call 'cond_resched' in 'mem_cgroup_scan_tasks' every
>> 1000 iterations. For global OOM, call 'touch_softlockup_watchdog' every
>> 1000 iterations to avoid this issue.
>>
>> Fixes: 9cbb78bb3143 ("mm, memcg: introduce own oom handler to iterate only over its own threads")
>> Signed-off-by: Chen Ridong <chenridong@...wei.com>
>> ---
>>  mm/memcontrol.c | 7 ++++++-
>>  mm/oom_kill.c   | 8 +++++++-
>>  2 files changed, 13 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 65fb5eee1466..46f8b372d212 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -1161,6 +1161,7 @@ void mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
>>  {
>>  	struct mem_cgroup *iter;
>>  	int ret = 0;
>> +	int i = 0;
>>  
>>  	BUG_ON(mem_cgroup_is_root(memcg));
>>  
>> @@ -1169,8 +1170,12 @@ void mem_cgroup_scan_tasks(struct mem_cgroup *memcg,
>>  		struct task_struct *task;
>>  
>>  		css_task_iter_start(&iter->css, CSS_TASK_ITER_PROCS, &it);
>> -		while (!ret && (task = css_task_iter_next(&it)))
>> +		while (!ret && (task = css_task_iter_next(&it))) {
>> +			/* Avoid potential softlockup warning */
>> +			if ((++i & 1023) == 0)
>> +				cond_resched();
>>  			ret = fn(task, arg);
>> +		}
>>  		css_task_iter_end(&it);
>>  		if (ret) {
>>  			mem_cgroup_iter_break(memcg, iter);
>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>> index 1c485beb0b93..044ebab2c941 100644
>> --- a/mm/oom_kill.c
>> +++ b/mm/oom_kill.c
>> @@ -44,6 +44,7 @@
>>  #include <linux/init.h>
>>  #include <linux/mmu_notifier.h>
>>  #include <linux/cred.h>
>> +#include <linux/nmi.h>
>>  
>>  #include <asm/tlb.h>
>>  #include "internal.h"
>> @@ -430,10 +431,15 @@ static void dump_tasks(struct oom_control *oc)
>>  		mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
>>  	else {
>>  		struct task_struct *p;
>> +		int i = 0;
>>  
>>  		rcu_read_lock();
>> -		for_each_process(p)
>> +		for_each_process(p) {
>> +			/* Avoid potential softlockup warning */
>> +			if ((++i & 1023) == 0)
>> +				touch_softlockup_watchdog();
> 
> This might suppress the soft lockup, but won't an RCU stall still be detected?

Yes, an RCU stall was still detected.
For global OOM, the system is likely to be struggling anyway; do we have to
do additional work to suppress the RCU stall detection as well?
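
To make the distinction between the two hunks explicit, here is a rough
sketch that combines them with comments (illustration only; the real change
is the diff quoted above and below):

	/* memcg path: no locks are held between iterations, so it is
	 * safe to reschedule here. */
	css_task_iter_start(&iter->css, CSS_TASK_ITER_PROCS, &it);
	while (!ret && (task = css_task_iter_next(&it))) {
		if ((++i & 1023) == 0)
			cond_resched();			/* may sleep */
		ret = fn(task, arg);
	}
	css_task_iter_end(&it);

	/* global path: for_each_process() requires rcu_read_lock(), and
	 * cond_resched() must not be used inside an RCU read-side
	 * critical section, so only the soft lockup watchdog is poked.
	 * That silences the soft lockup but does not report a quiescent
	 * state, which is why an RCU stall can still be detected. */
	rcu_read_lock();
	for_each_process(p) {
		if ((++i & 1023) == 0)
			touch_softlockup_watchdog();	/* does not sleep */
		dump_task(p, oc);
	}
	rcu_read_unlock();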

Best regards,
Ridong

> 
>>  			dump_task(p, oc);
>> +		}
>>  		rcu_read_unlock();
>>  	}
>>  }
> 

