lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.00.1103071234480.10264@chino.kir.corp.google.com>
Date:	Mon, 7 Mar 2011 12:36:49 -0800 (PST)
From:	David Rientjes <rientjes@...gle.com>
To:	Andrew Vagin <avagin@...il.com>
cc:	KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
	Andrey Vagin <avagin@...nvz.org>,
	Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH] mm: skip zombie in OOM-killer

On Mon, 7 Mar 2011, Andrew Vagin wrote:

> > Andrey is patching the case where an eligible TIF_MEMDIE process is found
> > but it has already detached its ->mm.  In combination with the patch
> > posted to linux-mm, oom: prevent unnecessary oom kills or kernel panics,
> > which makes select_bad_process() iterate over all threads, it is an
> > effective solution.
> 
> Probably you said about the first version of my patch.
> This version is incorrect because of
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=dd8e8f405ca386c7ce7cbb996ccd985d283b0e03
> 
> but my first patch is correct and it has a simple reproducer(I
> attached it). You can execute it and your kernel hangs up, because the
> parent doesn't wait children, but the one child (zombie) will have
> flag TIF_MEMDIE, oom_killer will kill nobody
> 

The second version of your patch works fine in combination with the 
pending "oom: prevent unnecessary oom kills or kernel panics" patch from 
linux-mm (included below).  Try your test case with both this patch and 
the second version of your patch.

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -292,11 +292,11 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 		unsigned long totalpages, struct mem_cgroup *mem,
 		const nodemask_t *nodemask)
 {
-	struct task_struct *p;
+	struct task_struct *g, *p;
 	struct task_struct *chosen = NULL;
 	*ppoints = 0;
 
-	for_each_process(p) {
+	do_each_thread(g, p) {
 		unsigned int points;
 
 		if (oom_unkillable_task(p, mem, nodemask))
@@ -324,7 +324,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 		 * the process of exiting and releasing its resources.
 		 * Otherwise we could get an easy OOM deadlock.
 		 */
-		if (thread_group_empty(p) && (p->flags & PF_EXITING) && p->mm) {
+		if ((p->flags & PF_EXITING) && p->mm) {
 			if (p != current)
 				return ERR_PTR(-1UL);
 
@@ -337,7 +337,7 @@ static struct task_struct *select_bad_process(unsigned int *ppoints,
 			chosen = p;
 			*ppoints = points;
 		}
-	}
+	} while_each_thread(g, p);
 
 	return chosen;
 }

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ