Message-ID: <993e7783-60e9-ba03-b512-c829b9e833fd@i-love.sakura.ne.jp>
Date: Thu, 12 Mar 2020 07:04:20 +0900
From: Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
To: David Rientjes <rientjes@...gle.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Vlastimil Babka <vbabka@...e.cz>,
Michal Hocko <mhocko@...nel.org>, linux-kernel@...r.kernel.org,
linux-mm@...ck.org
Subject: Re: [patch] mm, oom: prevent soft lockup on memcg oom for UP systems
On 2020/03/12 4:38, David Rientjes wrote:
> On Wed, 11 Mar 2020, Tetsuo Handa wrote:
>
>>>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>>>> --- a/mm/vmscan.c
>>>>> +++ b/mm/vmscan.c
>>>>> @@ -2637,6 +2637,8 @@ static void shrink_node_memcgs(pg_data_t *pgdat, struct scan_control *sc)
>>>>> unsigned long reclaimed;
>>>>> unsigned long scanned;
>>>>>
>>>>> + cond_resched();
>>>>> +
>>>>
>>>> Is this safe for the CONFIG_PREEMPTION case? If the current thread has realtime priority,
>>>> can we guarantee that the OOM victim (well, the OOM reaper kernel thread rather than
>>>> the OOM victim?) gets scheduled?
>>>>
>>>
>>> I think it's the best we can do that immediately solves the issue, unless
>>> you have another idea in mind?
>>
>> "schedule_timeout_killable(1) outside of oom_lock" or "the OOM reaper grabs oom_lock
>> so that allocating threads guarantee that the OOM reaper gets scheduled" or "direct OOM
>> reaping so that allocating threads guarantee that some memory is reclaimed".
>>
>
> The cond_resched() here is already needed if the iteration is lengthy, depending on
> the number of descendant memcgs.
No. cond_resched() here will become a no-op if CONFIG_PREEMPTION=y, and a current
thread with realtime priority will not yield the CPU.
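
For reference, this is roughly how cond_resched() reduces when CONFIG_PREEMPTION=y
(simplified from include/linux/sched.h of this era; not an exact quote):

/*
 * Simplified sketch: with CONFIG_PREEMPTION=y the preemption points already
 * exist everywhere, so _cond_resched() compiles down to "return 0".
 * A realtime-priority thread that never blocks therefore keeps the CPU, and
 * on a UP system a SCHED_NORMAL kernel thread like the OOM reaper never runs.
 */
#ifndef CONFIG_PREEMPTION
extern int _cond_resched(void);			/* may call __schedule() */
#else
static inline int _cond_resched(void) { return 0; }
#endif

#define cond_resched() ({			\
	___might_sleep(__FILE__, __LINE__, 0);	\
	_cond_resched();			\
})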
>
> schedule_timeout_killable(1) does not make any guarantees that current
> will be scheduled after the victim or oom_reaper on UP systems.
The point of schedule_timeout_*(1) is to guarantee that the current thread yields
the CPU to other threads, even in the case where CONFIG_PREEMPTION=y and the current
thread has realtime priority. That there is no guarantee the current thread will be
rescheduled immediately after the sleep is irrelevant.
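
A rough sketch of what I mean by "schedule_timeout_killable(1) outside of oom_lock"
(the exact placement in the allocation slowpath is only an illustration, not a
tested patch):

	/*
	 * Sketch only: around the OOM-killer invocation in the allocation
	 * slowpath.  An explicit one-jiffy sleep yields the CPU even when the
	 * allocating thread has realtime priority and CONFIG_PREEMPTION=y,
	 * which cond_resched() cannot guarantee.
	 */
	if (!mutex_trylock(&oom_lock)) {
		/* Somebody else is handling the OOM situation; let them run. */
		schedule_timeout_killable(1);
		goto retry;
	}
	out_of_memory(&oc);
	mutex_unlock(&oom_lock);
	/* Sleep after dropping oom_lock so the victim and the OOM reaper can run. */
	schedule_timeout_killable(1);
	goto retry;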
>
> If you have an alternate patch to try, we can test it. But since this
> cond_resched() is needed anyway, I'm not sure it will change the result.
schedule_timeout_killable(1) is an alternate patch to try; I don't think
that this cond_resched() is needed anyway.
>
>>>
>>>>> switch (mem_cgroup_protected(target_memcg, memcg)) {
>>>>> case MEMCG_PROT_MIN:
>>>>> /*
>>>>>
>>>>
>>
>>