Message-ID: <20170622165858.GA30035@castle>
Date:   Thu, 22 Jun 2017 17:58:58 +0100
From:   Roman Gushchin <guro@...com>
To:     Tetsuo Handa <penguin-kernel@...ove.sakura.ne.jp>
CC:     <linux-mm@...ck.org>, Michal Hocko <mhocko@...nel.org>,
        Vladimir Davydov <vdavydov.dev@...il.com>,
        Johannes Weiner <hannes@...xchg.org>,
        Tejun Heo <tj@...nel.org>, <kernel-team@...com>,
        <cgroups@...r.kernel.org>, <linux-doc@...r.kernel.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [v3 1/6] mm, oom: use oom_victims counter to synchronize oom
 victim selection

On Thu, Jun 22, 2017 at 09:40:28AM +0900, Tetsuo Handa wrote:
> Roman Gushchin wrote:
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -992,6 +992,13 @@ bool out_of_memory(struct oom_control *oc)
> >  	if (oom_killer_disabled)
> >  		return false;
> >  
> > +	/*
> > +	 * If there are oom victims in flight, we don't need to select
> > +	 * a new victim.
> > +	 */
> > +	if (atomic_read(&oom_victims) > 0)
> > +		return true;
> > +
> >  	if (!is_memcg_oom(oc)) {
> >  		blocking_notifier_call_chain(&oom_notify_list, 0, &freed);
> >  		if (freed > 0)
> 
> Above in this patch and below in patch 5 are wrong.
> 
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -665,7 +672,13 @@ static void mark_oom_victim(struct task_struct *tsk)
> >  	 * that TIF_MEMDIE tasks should be ignored.
> >  	 */
> >  	__thaw_task(tsk);
> > -	atomic_inc(&oom_victims);
> > +
> > +	/*
> > +	 * If there are no oom victims in flight,
> > +	 * give the task an access to the memory reserves.
> > +	 */
> > +	if (atomic_inc_return(&oom_victims) == 1)
> > +		set_tsk_thread_flag(tsk, TIF_MEMDIE);
> >  }
> >  
> >  /**
> 
> The OOM reaper is not available for CONFIG_MMU=n kernels, and timeout-based
> giveup is not permitted, yet a multithreaded process might be selected as
> an OOM victim. Not setting TIF_MEMDIE on all threads sharing the OOM victim's
> mm increases the chance that some OOM victim thread is prevented from terminating
> (e.g. a thread holding mmap_sem for write cannot leave __alloc_pages_slowpath()
> because it is waiting for the TIF_MEMDIE thread to call exit_oom_victim(), while
> the TIF_MEMDIE thread is waiting for the thread holding mmap_sem for write).

I agree that CONFIG_MMU=n is a special case and the proposed approach can't
be used there directly. But can you please explain why you find the first
chunk wrong? Basically, it's exactly what we do now: we increment oom_victims
for every oom victim, and every process decrements this counter on its exit
path. If there is at least one existing victim, we will select it again, so
this is just an optimization. Am I missing something? Why should we start a
new victim selection if there are processes that will likely quit and release
memory soon?
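
For context, the decrement side I'm referring to is exit_oom_victim() in
mm/oom_kill.c; roughly something like this (a simplified sketch from memory,
not the exact mainline source):

	void exit_oom_victim(void)
	{
		/* The victim is done using the memory reserves. */
		clear_thread_flag(TIF_MEMDIE);

		/* Drop our count; wake oom_killer_disable() waiters on the last one. */
		if (!atomic_dec_return(&oom_victims))
			wake_up_all(&oom_victims_wait);
	}

So having out_of_memory() return true while oom_victims > 0 just means we keep
relying on victims that are already on their way out instead of picking a new one.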

Thanks!
