lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAMbhsRRUNVn9_eL_obZxcZv7LqDW_8SeRApdKzDePbaW_Y4uvg@mail.gmail.com>
Date:	Mon, 9 Apr 2012 15:13:00 -0700
From:	Colin Cross <ccross@...gle.com>
To:	David Rientjes <rientjes@...gle.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	werner <w.landgraf@...ru>, Rik van Riel <riel@...hat.com>,
	Hugh Dickins <hughd@...gle.com>, linux-kernel@...r.kernel.org,
	Oleg Nesterov <oleg@...hat.com>,
	Rabin Vincent <rabin.vincent@...ricsson.com>,
	Christian Bejram <christian.bejram@...ricsson.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Anton Vorontsov <anton.vorontsov@...aro.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	stable@...r.kernel.org
Subject: Re: v3.4-rc2 out-of-memory problems (was Re: 3.4-rc1 sticks-and-crashs)

On Mon, Apr 9, 2012 at 2:22 PM, David Rientjes <rientjes@...gle.com> wrote:
> On Mon, 9 Apr 2012, Linus Torvalds wrote:
>
>> The real bug is actually that those notifiers are a f*cking joke, and
>> the return value from the notifier is a mistake.
>>
>> So I personally think that the real problem is this code in
>> profile_handoff_task:
>>
>>         return (ret == NOTIFY_OK) ? 1 : 0;
>>
>> and ask yourself two questions:
>>
>>  - what the hell does NOTIFY_OK/NOTIFY_DONE mean?
>>  - what happens if there are multiple notifiers that all (or some)
>> return NOTIFY_OK?
>>
> NOTIFY_OK should never be a valid response for this notifier the way it's
> currently implemented, it should be NOTIFY_STOP to stop iterating the call
> chain to avoid a double free.  Right now it doesn't matter because only
> oprofile is actually freeing the task_struct and lowmemorykiller should be
> using NOTIFY_DONE.
>
> Then we have a completeness issue if multiple callbacks want to return
> NOTIFY_STOP and an ordering issue if the oprofile callback is invoked
> before lowmemorykiller.
>
>> I'll tell you what my answers are:
>>
>>  (a) NOTIFY_DONE is the "ok, everything is fine, you can free the
>> task-struct". It's also what that handoff notifier thing returns if
>> there are no notifiers registered at all.
>>
>>      So the fix to the Android lowmemorykiller is as simple as just
>> changing NOTIFY_OK to NOTIFY_DONE, which will mean that the caller
>> will properly free the task struct.
>>
>
> I don't think so for Werner's config who also has CONFIG_OPROFILE=y, so
> oprofile would return NOTIFY_OK and queue the task_struct for free, then
> the second notifier callback to the lowmemorykiller would return
> NOTIFY_DONE which would result in put_task_struct() doing free_task()
> itself for a double free.
>
>>      The NOTIFY_OK/NOTIFY_DONE difference really does seem to be just
>> "NOTIFY_OK means that I will free the task myself later". That's what
>> the oprofile uses, and it frees the task.
>>
>>  (b) But the whole interface is a total f*cking mess. If *multiple*
>> people return NOTIFY_OK, they're royally fucked. And the whole (and
>> only) point of notifiers is that you can register multiple different
>> ones independently.
>>
>> So quite frankly, the *real* bug is not in that android driver
>> (although I'd say that we should just make it return NOTIFY_DONE and
>> be done with it). The real bug is that the whole f*cking notifier is a
>> mistake, and checking the error return was the biggest mistake of all.
>>
>
> Right, we can't handoff the freeing of the task_struct to more than one
> notifier.  It seems misdesigned from the beginning and what we really want
> is to hijack task->usage for __put_task_struct(task) if we have such a
> notifier callchain and require each one (currently just oprofile) to take
> a reference on task->usage for NOTIFY_OK and then be responsible for
> dropping the reference when it's done with it later instead of requiring
> it to free the task_struct itself.
>
> That's _if_ we want to continue to have such an interface in the first
> place where it's only really necessary right now for oprofile (and, hence,
> wasn't implemented in an extendable way).  I'm thinking the
> lowmemorykiller, as I eluded to, could be written in a way where we can
> detect if a thread we've already killed has exited yet before killing
> another one.  We can't just store a pointer to the task_struct of the
> killed task since it could be reused for a fork later, but we could use
> TIF_MEMDIE like the oom killer does.

This was a known issue in 2010, in the android tree the use of
task_handoff_register was dropped one day after it was added and
replaced with a new task_free_register hook.  I assume Greg dropped
the fix during the android tree refresh in 3.0 because it depended on
a change to kernel/fork.c.  The two relevant patches are (using
codeaurora's gitweb becase we don't have one right now):

sched: Add a generic notifier when a task struct is about to be freed
https://www.codeaurora.org/gitweb/quic/la/?p=kernel/common.git;a=commitdiff;h=667dffa787a87ef4ea43cc65957ce96077fdcd0a

staging: android: lowmemorykiller: Fix task_struct leak
https://www.codeaurora.org/gitweb/quic/la/?p=kernel/common.git;a=commitdiff;h=af0240f095a704f75f032bbcc01f670c65c163ba
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ