lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20160616155347.GO6836@dhcp22.suse.cz>
Date:	Thu, 16 Jun 2016 17:53:48 +0200
From:	Michal Hocko <mhocko@...nel.org>
To:	Tetsuo Handa <penguin-kernel@...ove.SAKURA.ne.jp>
Cc:	linux-mm@...ck.org, rientjes@...gle.com, oleg@...hat.com,
	vdavydov@...allels.com, akpm@...ux-foundation.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 07/10] mm, oom: fortify task_will_free_mem

On Fri 17-06-16 00:40:41, Tetsuo Handa wrote:
> Michal Hocko wrote:
> > On Thu 16-06-16 21:54:27, Tetsuo Handa wrote:
> > > Michal Hocko wrote:
> > > > On Sat 11-06-16 17:10:03, Tetsuo Handa wrote:
> > [...]
> > > I still don't like it. current->mm == NULL in
> > > 
> > > -	if (current->mm &&
> > > -	    (fatal_signal_pending(current) || task_will_free_mem(current))) {
> > > +	if (task_will_free_mem(current)) {
> > > 
> > > is not highly unlikely. You obviously break commit d7a94e7e11badf84
> > > ("oom: don't count on mm-less current process") on CONFIG_MMU=n kernels.
> > 
> > I still fail to see why you care about that case so much. The heuristic
> > was broken for other reasons before this patch. The patch fixes a class
> > of issues for both mmu and nommu. I can restore the current->mm check
> > for now but the more I am thinking about it the less I am sure the
> > commit you are referring to is evem correct/necessary.
> > 
> > It claims that the OOM killer would be stuck because the child would be
> > sitting in the final schedule() until the parent reaps it. That is not
> > true, though, because victim would be unhashed down in release_task()
> > path so it is not visible by the oom killer when it is waiting for the
> > parent.  I have completely missed that part when reviewing the patch. Or
> > am I missing something...
> 
> That explanation started from 201411292304.CGF68419.MOLHVQtSFFOOJF@...ove.SAKURA.ne.jp
> (Sat, 29 Nov 2014 23:04:33 +0900) in your mailbox. I confirmed that a TIF_MEMDIE
> zombie inside the final schedule() in do_exit() is waiting for parent to reap.
> release_task() will be called when parent noticed that there is a zombie, but
> this OOM livelock situation prevented parent looping inside page allocator waiting
> for that TIF_MEMDIE zombie from noticing that there is a zombie.

I cannot seem to find this msg-id. Anyway, let's forget it for now
to not get side tracked. I have to study that code more deeply to better
understand it.

> > Anyway, would you be OK with the patch if I added the current->mm check
> > and resolve its necessity in a separate patch?
> 
> Please correct task_will_free_mem() in oom_kill_process() as well.

We cannot hold task_lock over all task_will_free_mem I am even not sure
we have to develop an elaborate way to make it raceless just for the nommu
case. The current case is simple as we cannot race here. Is that
sufficient for you?

-- 
Michal Hocko
SUSE Labs

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ