Message-ID: <20160219194128.GA17342@cmpxchg.org>
Date: Fri, 19 Feb 2016 14:41:28 -0500
From: Johannes Weiner <hannes@...xchg.org>
To: Rik van Riel <riel@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
Mel Gorman <mgorman@...e.de>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, kernel-team@...com
Subject: Re: [PATCH] mm: scale kswapd watermarks in proportion to memory

On Thu, Feb 18, 2016 at 03:15:43PM -0500, Rik van Riel wrote:
> On Thu, 2016-02-18 at 11:41 -0500, Johannes Weiner wrote:
> > In machines with 140G of memory and enterprise flash storage, we have
> > seen read and write bursts routinely exceed the kswapd watermarks and
> > cause thundering herds in direct reclaim. Unfortunately, the only way
> > to tune kswapd aggressiveness is through adjusting min_free_kbytes -
> > the system's emergency reserves - which is entirely unrelated to the
> > system's latency requirements. In order to get kswapd to maintain a
> > 250M buffer of free memory, the emergency reserves need to be set to
> > 1G. That is a lot of memory wasted for no good reason.
> >
> > On the other hand, it's reasonable to assume that allocation bursts
> > and overall allocation concurrency scale with memory capacity, so it
> > makes sense to make kswapd aggressiveness a function of that as well.
> >
> > Change the kswapd watermark scale factor from the currently fixed 25%
> > of the tunable emergency reserve to a tunable 0.1% of memory.
> >
> > On a 140G machine, this raises the default watermark steps - the
> > distance between min and low, and low and high - from 16M to 143M.
>
> This is an excellent idea for a large system,
> but your patch reduces the gap between watermarks
> on small systems.
>
> On an 8GB zone, your patch halves the gap between
> the watermarks, and on smaller systems it would be
> even worse.

You're right, I'll address that in v2.

> Would it make sense to keep using the old calculation
> on small systems, when the result of the old calculation
> exceeds that of the new calculation?
>
> Using the max of the two calculations could prevent
> the issue you are trying to prevent on large systems,
> from happening on smaller systems.

Yes, I think enforcing a reasonable minimum this way makes sense.
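
Roughly, taking the max of the two calculations could look like the
untested sketch below. It is not the v2 code: wmark_step_kb() and its
parameter names are made up for illustration, the scale factor is assumed
to be expressed in units of 1/10000 of memory, and everything is in KB
rather than pages to keep the arithmetic easy to follow.

```c
#include <stdint.h>

/*
 * Illustrative only: distance between two adjacent watermarks
 * (min -> low, low -> high), in KB.
 *
 * min_free_kb     - the emergency reserve (min_free_kbytes)
 * mem_kb          - memory size of the zone or machine, in KB
 * scale_per_10000 - hypothetical tunable, in units of 1/10000 of memory
 *                   (10 would mean 0.1%)
 */
static uint64_t wmark_step_kb(uint64_t min_free_kb, uint64_t mem_kb,
			      uint64_t scale_per_10000)
{
	/* Old calculation: a fixed 25% of the emergency reserve. */
	uint64_t old_step = min_free_kb / 4;

	/* New calculation: a tunable fraction of memory. */
	uint64_t new_step = mem_kb * scale_per_10000 / 10000;

	/* Never pick a smaller step than the old formula would. */
	return old_step > new_step ? old_step : new_step;
}
```

Assuming a 64M reserve (which is what 16M steps at 25% imply) and a
factor of 10, i.e. 0.1% of memory, a 140G machine gets a step of about
143M from the new formula, while an 8G zone keeps the old 16M step
instead of dropping to roughly 8M.
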
Thanks Rik.