lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:	Mon, 9 Apr 2012 08:39:52 -0700
From:	Linus Torvalds <torvalds@...ux-foundation.org>
To:	David Rientjes <rientjes@...gle.com>
Cc:	Andrew Morton <akpm@...ux-foundation.org>,
	werner <w.landgraf@...ru>, Rik van Riel <riel@...hat.com>,
	Hugh Dickins <hughd@...gle.com>, linux-kernel@...r.kernel.org,
	Oleg Nesterov <oleg@...hat.com>,
	Rabin Vincent <rabin.vincent@...ricsson.com>,
	Christian Bejram <christian.bejram@...ricsson.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Anton Vorontsov <anton.vorontsov@...aro.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	stable@...r.kernel.org
Subject: Re: v3.4-rc2 out-of-memory problems (was Re: 3.4-rc1 sticks-and-crashs)

On Mon, Apr 9, 2012 at 3:15 AM, David Rientjes <rientjes@...gle.com> wrote:
>
> I think you nailed it.
>
> I suspect the problem is 1eda5166c764 ("staging: android/lowmemorykiller:
> Don't unregister notifier from atomic context") merged during the 3.4
> merge window and, unfortunately, backported to stable.

Ok. That does seem to match everything.

However, I think your patch is the wrong one.

The real bug is actually that those notifiers are a f*cking joke, and
the return value from the notifier is a mistake.

So I personally think that the real problem is this code in
profile_handoff_task:

        return (ret == NOTIFY_OK) ? 1 : 0;

and ask yourself two questions:

 - what the hell does NOTIFY_OK/NOTIFY_DONE mean?
 - what happens if there are multiple notifiers that all (or some)
return NOTIFY_OK?

I'll tell you what my answers are:

 (a) NOTIFY_DONE is the "ok, everything is fine, you can free the
task-struct". It's also what that handoff notifier thing returns if
there are no notifiers registered at all.

     So the fix to the Android lowmemorykiller is as simple as just
changing NOTIFY_OK to NOTIFY_DONE, which will mean that the caller
will properly free the task struct.

     The NOTIFY_OK/NOTIFY_DONE difference really does seem to be just
"NOTIFY_OK means that I will free the task myself later". That's what
the oprofile uses, and it frees the task.

 (b) But the whole interface is a total f*cking mess. If *multiple*
people return NOTIFY_OK, they're royally fucked. And the whole (and
only) point of notifiers is that you can register multiple different
ones independently.

So quite frankly, the *real* bug is not in that android driver
(although I'd say that we should just make it return NOTIFY_DONE and
be done with it). The real bug is that the whole f*cking notifier is a
mistake, and checking the error return was the biggest mistake of all.

Werner: just test David's patch (do *not* change both the error value
*and* apply David's patch - that would free the task-struct twice). I
don't think his patch is what I want to apply eventually, but it
should fix the issue.

Sadly, I don't think we have anybody who really "owns"
kernel/profile.c - the thing is broken, it was misdesigned, and nobody
really cares. Which is why we'll probably have to fix this by just
making that Android thing return NOTIFY_DONE, and just accept that the
whole thing is a f*cking joke.

                         Linus
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ