lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <alpine.DEB.2.00.1204090235350.14879@chino.kir.corp.google.com>
Date:	Mon, 9 Apr 2012 03:15:08 -0700 (PDT)
From:	David Rientjes <rientjes@...gle.com>
To:	Linus Torvalds <torvalds@...ux-foundation.org>
cc:	Andrew Morton <akpm@...ux-foundation.org>,
	werner <w.landgraf@...ru>, Rik van Riel <riel@...hat.com>,
	Hugh Dickins <hughd@...gle.com>, linux-kernel@...r.kernel.org,
	Oleg Nesterov <oleg@...hat.com>,
	Rabin Vincent <rabin.vincent@...ricsson.com>,
	Christian Bejram <christian.bejram@...ricsson.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Anton Vorontsov <anton.vorontsov@...aro.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	stable@...r.kernel.org
Subject: Re: v3.4-rc2 out-of-memory problems (was Re: 3.4-rc1
 sticks-and-crashs)

On Sun, 8 Apr 2012, Linus Torvalds wrote:

> Or does anybody see anything that keeps thread counts raised so that
> "free_task()" doesn't get done. kernel/profoe.c does that
> "profile_handoff_task()" thing - but only oprofile and the android
> low-memory-killer logic seems to use it though. But that's exactly the
> kind of thing that Werner's "configure everything" might enable -
> Werner?
> 

I think you nailed it.

I suspect the problem is 1eda5166c764 ("staging: android/lowmemorykiller: 
Don't unregister notifier from atomic context") merged during the 3.4 
merge window and, unfortunately, backported to stable.

Werner's config has CONFIG_ANDROID_LOW_MEMORY_KILLER=y so we never 
actually unregister the callback for the task handoff as a result of the 
patch.  It's supposed to take responsibility for doing free_task() itself 
when it's good and ready, usually by putting it into a list to free, but 
now we're just doing this:

	struct task_struct *task = data;

	if (task == lowmem_deathpending)
		lowmem_deathpending = NULL;

	return NOTIFY_OK;

whenever put_task_struct() decrements the refcount to 0 and thus they get 
leaked and bad things happen.

This is confirmed by Werner's oom log that shows extremely small values 
for the oom score of the task chosen to oom kill.  His first log showed X 
being killed with a score of 29.  That means it is the most memory-hogging 
task on the system and is only using 2.9% of total system memory.

I can't actually see how the lowmemorykiller actually ever freed any 
task_struct after unregistering the notifier during the callback.  It 
seems like this has always leaked memory but it used to happen much more 
slowly because, prior to the patch, we did task_handoff_unregister() in 
the callback.  So I think the code was always wrong but now it's out of 
control because the notifier remains enabled indefinitely.  I can't say 
the 1eda5166c764 ("staging: android/lowmemorykiller: Don't unregister 
notifier from atomic context") commit is fully to blame, it just made the 
error much more egregious.

As it sits in 3.4-rc2, this whole lowmem_deathpending business seems to be 
storing a pointer to the task_struct of something sent a SIGKILL and it 
remains that way until the lowmem_deathpending_timeout expires and 
something else is killed instead.  lowmem_deathpending gets cleared on the 
task handoff if the task selected for kill just exited.  This ensures we 
only kill one thread at a time.

That's all fine and good but it seems like we're never freeing the 
task_struct itself on exit.  This seems like the most obvious fix but it 
would be really nice to revisit this and remove the dependency on 
CONFIG_PROFILING and just check if the lowmem_deathpending thread is found 
in the iteration for lowmem_shrink() prior to killing.


android, lowmemorykiller: free task struct on profiling handoff

The lowmemorykiller stores a pointer to a killed thread's task_struct in 
lowmem_deathpending when profiling is enabled.  When put_task_struct() 
results in the refcount going to 0, the task_notify_func() callback clears 
lowmem_deathpending if it is the thread that was killed last.  This 
prevents additional killing until lowmem_deathpending_timeout elapses.

The responsibility of every task handoff notifier is to free the tasks 
handed off to it, however, and this was being neglected, which results 
in a massive memory leak since no task_struct ever gets freed.

Fix that by freeing the task_struct since we no longer need a reference to 
it.

Reported-by: werner <w.landgraf@...ru>
Cc: stable@...r.kernel.org
Signed-off-by: David Rientjes <rientjes@...gle.com>
---
 drivers/staging/android/lowmemorykiller.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/staging/android/lowmemorykiller.c b/drivers/staging/android/lowmemorykiller.c
--- a/drivers/staging/android/lowmemorykiller.c
+++ b/drivers/staging/android/lowmemorykiller.c
@@ -78,6 +78,7 @@ task_notify_func(struct notifier_block *self, unsigned long val, void *data)
 
 	if (task == lowmem_deathpending)
 		lowmem_deathpending = NULL;
+	free_task(task);
 
 	return NOTIFY_OK;
 }
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ