Message-ID: <20180807201935.GB4251@cmpxchg.org>
Date:   Tue, 7 Aug 2018 16:19:35 -0400
From:   Johannes Weiner <hannes@...xchg.org>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
Cc:     Michal Hocko <mhocko@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>, linux-mm@...ck.org,
        Greg Thelen <gthelen@...gle.com>,
        Dmitry Vyukov <dvyukov@...gle.com>,
        LKML <linux-kernel@...r.kernel.org>,
        Michal Hocko <mhocko@...e.com>,
        David Rientjes <rientjes@...gle.com>
Subject: Re: [PATCH] memcg, oom: be careful about races when warning about no
 reclaimable task

On Tue, Aug 07, 2018 at 07:15:11PM +0900, Tetsuo Handa wrote:
> On 2018/08/07 16:25, Michal Hocko wrote:
> > @@ -1703,7 +1703,8 @@ static enum oom_status mem_cgroup_oom(struct mem_cgroup *memcg, gfp_t mask, int
> >  		return OOM_ASYNC;
> >  	}
> >  
> > -	if (mem_cgroup_out_of_memory(memcg, mask, order))
> > +	if (mem_cgroup_out_of_memory(memcg, mask, order) ||
> > +			tsk_is_oom_victim(current))
> >  		return OOM_SUCCESS;
> >  
> >  	WARN(1,"Memory cgroup charge failed because of no reclaimable memory! "
> > 
> 
> I don't think this patch is appropriate. This patch only avoids hitting WARN(1).
> This patch does not address the root cause:
> 
> The task_will_free_mem(current) test in out_of_memory() is returning false
> because the test_bit(MMF_OOM_SKIP, &mm->flags) check in task_will_free_mem()
> is true: MMF_OOM_SKIP was already set by the OOM reaper. The OOM killer does
> not need to start selecting the next OOM victim until "current thread
> completes __mmput()" or "it fails to complete __mmput() within a reasonable
> period".

I don't see why it matters whether the OOM victim exits or not, unless
you count the memory consumed by struct task_struct.

> According to https://syzkaller.appspot.com/text?tag=CrashLog&x=15a1c770400000 ,
> PID=23767 selected PID=23766 as an OOM victim and the OOM reaper set MMF_OOM_SKIP
> before PID=23766 unnecessarily selects PID=23767 as next OOM victim.
> At uptime = 366.550949, out_of_memory() should have returned true without selecting
> next OOM victim because tsk_is_oom_victim(current) == true.

The code works just fine. We have to kill tasks until we a) free
enough memory or b) run out of tasks or c) kill current. When one of
these outcomes is reached, we allow the charge and return.
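
To make the three exit conditions concrete, here is a rough userspace sketch of that loop. It is not the kernel code: struct toy_task, toy_select_victim() and toy_oom_until_done() are hypothetical names made up for illustration, and "kill the largest live task" is only a stand-in for the real victim selection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* The three outcomes after which the charge is allowed to proceed. */
enum toy_outcome { FREED_ENOUGH, NO_TASKS_LEFT, KILLED_CURRENT };

/* Hypothetical stand-in for a task in the memcg; not a kernel structure. */
struct toy_task {
	long rss;        /* pages released when this task is killed */
	bool alive;
	bool is_current; /* true for the task doing the charge */
};

/* Pick the largest live task (assumed heuristic, not oom_badness()). */
static struct toy_task *toy_select_victim(struct toy_task *tasks, size_t n)
{
	struct toy_task *victim = NULL;

	for (size_t i = 0; i < n; i++) {
		if (tasks[i].alive && (!victim || tasks[i].rss > victim->rss))
			victim = &tasks[i];
	}
	return victim;
}

/* Kill tasks until a) enough is freed, b) none are left, or c) current dies. */
static enum toy_outcome toy_oom_until_done(struct toy_task *tasks, size_t n,
					   long needed)
{
	long freed = 0;

	assert(tasks != NULL || n == 0);

	for (;;) {
		struct toy_task *victim;

		if (freed >= needed)
			return FREED_ENOUGH;            /* a) */

		victim = toy_select_victim(tasks, n);
		if (!victim)
			return NO_TASKS_LEFT;           /* b) */

		victim->alive = false;
		freed += victim->rss;

		if (victim->is_current)
			return KILLED_CURRENT;          /* c) */
	}
}
```

In this model every return path is a legitimate end state, so none of them would be the place to warn.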

The only problem here is a warning in the wrong place.
