linux-kernel - Re: v3.4-rc2 out-of-memory problems (was Re: 3.4-rc1 sticks-and-crashs)

Message-ID: <CAMbhsRSch1gJLRzPo1p8pCySsB9AJ01wUeVYKXf+zw31pnxumw@mail.gmail.com>
Date:	Mon, 9 Apr 2012 18:21:50 -0700
From:	Colin Cross <ccross@...gle.com>
To:	David Rientjes <rientjes@...gle.com>
Cc:	Linus Torvalds <torvalds@...ux-foundation.org>,
	Andrew Morton <akpm@...ux-foundation.org>,
	werner <w.landgraf@...ru>, Rik van Riel <riel@...hat.com>,
	Hugh Dickins <hughd@...gle.com>, linux-kernel@...r.kernel.org,
	Oleg Nesterov <oleg@...hat.com>,
	Rabin Vincent <rabin.vincent@...ricsson.com>,
	Christian Bejram <christian.bejram@...ricsson.com>,
	"Paul E. McKenney" <paulmck@...ux.vnet.ibm.com>,
	Anton Vorontsov <anton.vorontsov@...aro.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	stable@...r.kernel.org
Subject: Re: v3.4-rc2 out-of-memory problems (was Re: 3.4-rc1 sticks-and-crashs)

On Mon, Apr 9, 2012 at 5:32 PM, David Rientjes <rientjes@...gle.com> wrote:
> On Mon, 9 Apr 2012, Colin Cross wrote:
>
>> The point of the lowmem_deathpending patch was to avoid a stutter
>> where the cpu would spend its time looping through the tasks due to
>> repeated calls to lowmem_shrink instead of processing the kill signal
>> to the selected thread.
>
> What did you do to avoid this without CONFIG_PROFILING?
>
>> With this patch, it will still loop through
>> tasks until it finds the one that was previously killed and then
>> abort.  It's possible that the improvements Anton made to the task
>> loop reduce the performance impact enough that this whole mess could
>> just be dropped (by reverting 1eda516, e5d7965, and 4755b72).
>>
>
> I don't understand how calling shrink_slab() from direct reclaim or using
> drop_caches manually taking slightly longer because it has to iterate the
> tasklist to the point of the killed thread will significantly stall the
> thread from exiting.

Before Anton's fix, iterating the tasklist involved taking every task
lock, which probably made it very expensive.  I tried a quick test
where I deliberately limited memory to the point that it was
triggering the lowmemorykiller during boot.  It triggered about 5000
times, taking on the order of 50ms in total across all 5000 calls.
The result was about the same with your patch applied.
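
For illustration only, here is a minimal userspace sketch of the loop
behaviour discussed above: walk the task list, and abort as soon as a
task with a pending kill is found.  It is not the kernel code.  The
struct sim_task type, its fields, and sim_select_victim() are
hypothetical names made up for this sketch, and the locking that made
the real loop expensive is left out.  For scale, the figures above work
out to roughly 50ms / 5000 = 10us per call.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a task; NOT the kernel's task_struct. */
struct sim_task {
	int  oom_adj;       /* higher value = better kill candidate */
	bool kill_pending;  /* set once the task was picked and signalled */
};

/*
 * Walk the list looking for the best victim at or above min_adj.
 *
 * Returns the index of the chosen task, or -1 if the walk aborted
 * because an earlier kill is still pending (or no task was eligible).
 * *visited reports how many entries were examined, which is the cost
 * the thread is arguing about.
 */
int sim_select_victim(const struct sim_task *tasks, size_t n,
		      int min_adj, size_t *visited)
{
	int best = -1;
	size_t i;

	for (i = 0; i < n; i++) {
		if (tasks[i].kill_pending) {
			/* A previous victim has not exited yet: give up. */
			*visited = i + 1;
			return -1;
		}
		if (tasks[i].oom_adj < min_adj)
			continue;
		if (best < 0 || tasks[i].oom_adj > tasks[best].oom_adj)
			best = (int)i;
	}

	*visited = n;
	return best;
}
```

The abort still costs a partial walk, up to the position of the
pending task, which is why the per-call cost depends on how cheap each
iteration step is.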

> Much more likely is the killed thread cannot exit because you've killed it
> in a lowmem situation without giving it access to memory reserves so that
> it may exit quickly as my patch does.  That has a higher likelihood of
> stalling the exit than doing for_each_process().