linux-kernel - Re: [PATCH 3/3] mm: slub: Default slub_max

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <20110513105551.GE3569@suse.de>
Date:	Fri, 13 May 2011 11:55:51 +0100
From:	Mel Gorman <mgorman@...e.de>
To:	James Bottomley <James.Bottomley@...senPartnership.com>
Cc:	Johannes Weiner <hannes@...xchg.org>,
	Pekka Enberg <penberg@...nel.org>,
	Christoph Lameter <cl@...ux.com>,
	Andrew Morton <akpm@...ux-foundation.org>,
	Colin King <colin.king@...onical.com>,
	Raghavendra D Prabhu <raghu.prabhu13@...il.com>,
	Jan Kara <jack@...e.cz>, Chris Mason <chris.mason@...cle.com>,
	Rik van Riel <riel@...hat.com>,
	linux-fsdevel <linux-fsdevel@...r.kernel.org>,
	linux-mm <linux-mm@...ck.org>,
	linux-kernel <linux-kernel@...r.kernel.org>,
	linux-ext4 <linux-ext4@...r.kernel.org>
Subject: Re: [PATCH 3/3] mm: slub: Default slub_max_order to 0

On Thu, May 12, 2011 at 07:47:05PM -0500, James Bottomley wrote:
> On Fri, 2011-05-13 at 00:15 +0200, Johannes Weiner wrote:
> > On Thu, May 12, 2011 at 05:04:41PM -0500, James Bottomley wrote:
> > > On Thu, 2011-05-12 at 15:04 -0500, James Bottomley wrote:
> > > > Confirmed, I'm afraid ... I can trigger the problem with all three
> > > > patches under PREEMPT.  It's not a hang this time, it's just kswapd
> > > > taking 100% system time on 1 CPU and it won't calm down after I unload
> > > > the system.
> > > 
> > > Just on a "if you don't know what's wrong poke about and see" basis, I
> > > sliced out all the complex logic in sleeping_prematurely() and, as far
> > > as I can tell, it cures the problem behaviour.  I've loaded up the
> > > system, and taken the tar load generator through three runs without
> > > producing a spinning kswapd (this is PREEMPT).  I'll try with a
> > > non-PREEMPT kernel shortly.
> > > 
> > > What this seems to say is that there's a problem with the complex logic
> > > in sleeping_prematurely().  I'm pretty sure hacking up
> > > sleeping_prematurely() just to dump all the calculations is the wrong
> > > thing to do, but perhaps someone can see what the right thing is ...
> > 
> > I think I see the problem: the boolean logic of sleeping_prematurely()
> > is odd.  If it returns true, kswapd will keep running.  So if
> > pgdat_balanced() returns true, kswapd should go to sleep.
> > 
> > This?
> 
> I was going to say this was a winner, but on the third untar run on
> non-PREEMPT, I hit the kswapd livelock.  It's got much farther than
> previous attempts, which all hang on the first run, but I think the
> essential problem is still (at least on this machine) that
> sleeping_prematurely() is doing too much work for the wakeup storm that
> allocators are causing.
> 
> Something that ratelimits the amount of time we spend in the watermark
> calculations, like the below (which incorporates your pgdat fix) seems
> to be much more stable (I've not run it for three full runs yet, but
> kswapd CPU time is way lower so far).
> 
> The heuristic here is that if we're making the calculation more than ten
> times in 1/10 of a second, stop and sleep anyway.
> 

Is that heuristic not basically the same as this?

diff --git a/mm/vmscan.c b/mm/vmscan.c
index af24d1e..4d24828 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2251,6 +2251,10 @@ static bool sleeping_prematurely(pg_data_t *pgdat, int order, long remaining,
 	unsigned long balanced = 0;
 	bool all_zones_ok = true;
 
+	/* If kswapd has been running too long, just sleep */
+	if (need_resched())
+		return false;
+
 	/* If a direct reclaimer woke kswapd within HZ/10, it's premature */
 	if (remaining)
 		return true;

-- 
Mel Gorman
SUSE Labs
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/