linux-kernel - Re: [PATCH] mm: wait for congestion to clear on all zones


Message-ID: <50EDEC01.7090807@iskon.hr>
Date:	Wed, 09 Jan 2013 23:15:29 +0100
From:	Zlatko Calusic <zlatko.calusic@...on.hr>
To:	Andrew Morton <akpm@...ux-foundation.org>
CC:	Mel Gorman <mgorman@...e.de>, Hugh Dickins <hughd@...gle.com>,
	Minchan Kim <minchan.kim@...il.com>,
	linux-mm <linux-mm@...ck.org>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] mm: wait for congestion to clear on all zones

On 09.01.2013 22:48, Andrew Morton wrote:
> On Wed, 09 Jan 2013 22:41:48 +0100
> Zlatko Calusic <zlatko.calusic@...on.hr> wrote:
>
>> Currently we take a short nap (HZ/10) and wait for congestion to clear
>> before taking another pass with lower priority in balance_pgdat(). But
>> we do that only for the highest zone that we find to be unbalanced
>> and congested.
>>
>> This patch changes that to wait on all congested zones in a single
>> pass in the hope that it will save us some scanning that way. Also we
>> take a nap as soon as a congested zone is encountered and sc.priority <
>> DEF_PRIORITY - 2 (aka kswapd in trouble).
>>
>> ...
>>
>> The patch is against the mm tree. Make sure that
>> mm-avoid-calling-pgdat_balanced-needlessly.patch is applied first (not
>> yet in the mmotm tree). Tested on half a dozen systems with different
>> workloads for the last few days, working really well!
>
> But what are the user-observable effects of this change?  Less kernel
> CPU consumption, presumably?  Did you quantify it?
>

I have observed that without it, under some circumstances that are
VERY HARD to reproduce (many days need to pass and some stars to align to
see the effect), the page cache gets hit hard, with 2/3 of it evicted in a
split second. And it's not even under high load! So, I'm still
monitoring it, but so far the memory utilization really seems better
with the patch applied (no more mysterious page cache shootdowns).

Other than that, it just seems more correct to wait on all congested 
zones, not just the highest one. When I sent my first patch that 
replaced congestion_wait() I didn't have much time to do elaborate 
analysis (3.7.0 was released in a matter of hours). So, I just plugged 
the hole and continued working on the proper solution.
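
To make the difference between the two waiting policies concrete, here
is a toy userspace model in Python. It is only a sketch of the behaviour
as described in the changelog, not the patch itself and not kernel code:
Zone, zones_waited_old() and zones_waited_new() are hypothetical names
made up for illustration, and the model ignores the actual HZ/10 sleep
and everything else balance_pgdat() does.

```python
from dataclasses import dataclass

# DEF_PRIORITY is 12 in the kernel headers; here it is just a constant.
DEF_PRIORITY = 12


@dataclass
class Zone:
    # Hypothetical stand-in for a memory zone; not the kernel's struct zone.
    name: str
    unbalanced: bool
    congested: bool


def zones_waited_old(zones):
    """Old behaviour as described: wait only on the highest zone that is
    both unbalanced and congested. `zones` is ordered lowest to highest."""
    for zone in reversed(zones):
        if zone.unbalanced and zone.congested:
            return [zone.name]
    return []


def zones_waited_new(zones, priority):
    """Patched behaviour as described: once priority has dropped below
    DEF_PRIORITY - 2, wait on every congested zone seen in the pass."""
    if priority >= DEF_PRIORITY - 2:
        return []
    return [zone.name for zone in zones if zone.congested]
```

With zones DMA and DMA32 both unbalanced and congested and Normal
unbalanced but not congested, zones_waited_old() names only DMA32,
while zones_waited_new() at priority 9 names both DMA and DMA32.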

I do think that this is my last patch in this particular area
(balance_pgdat() & friends). But I'll continue investigating the
root cause of this interesting imbalance, which happens only on this
particular system, because I think balance_pgdat() behaviour was just
revealing it and the real problem is somewhere else.
-- 
Zlatko