Message-ID: <4BCE7DD1.70900@linux.vnet.ibm.com>
Date: Wed, 21 Apr 2010 06:23:45 +0200
From: Christian Ehrhardt <ehrhardt@...ux.vnet.ibm.com>
To: Rik van Riel <riel@...hat.com>
CC: Johannes Weiner <hannes@...xchg.org>, Mel Gorman <mel@....ul.ie>,
Andrew Morton <akpm@...ux-foundation.org>, linux-mm@...ck.org,
Nick Piggin <npiggin@...e.de>,
Chris Mason <chris.mason@...cle.com>,
Jens Axboe <jens.axboe@...cle.com>,
linux-kernel@...r.kernel.org, gregkh@...ell.com,
Corrado Zoccolo <czoccolo@...il.com>
Subject: Re: [RFC PATCH 0/3] Avoid the use of congestion_wait under zone pressure

Rik van Riel wrote:
> On 04/20/2010 11:32 AM, Johannes Weiner wrote:
>
>> The idea is that it pans out on its own. If the workload changes, new
>> pages get activated and when that set grows too large, we start shrinking
>> it again.
>>
>> Of course, right now this unscanned set is way too large and we can end
>> up wasting up to 50% of usable page cache on false active pages.
>
> Thing is, changing workloads often change back.
>
> Specifically, think of a desktop system that is doing
> work for the user during the day and gets backed up
> at night.
>
> You do not want the backup to kick the working set
> out of memory, because when the user returns in the
> morning the desktop should come back quickly after
> the screensaver is unlocked.
IMHO it is just as important to prevent the nightly backup job from
still being unfinished when the user arrives in the morning because we
didn't give it some more cache - and e.g. a 30 sec transition between
the two optimized states is fine.
But in the end I guess the point is that both behaviors are reasonable
goals - which one matters more depends on the user's needs.
What we could do is combine all the thoughts we have had so far:
a) Rik could create an experimental patch that excludes the in-flight pages
b) Johannes could create one for his suggestion to "always scan active
file pages but only deactivate them when the ratio is off and otherwise
strip buffers of clean pages"
c) I would extend the patch from Johannes, making the ratio of
active/inactive pages a userspace tunable (a rough sketch follows below)
a, b and a+b would then need to be tested to see whether they achieve
better behavior.
c, on the other hand, would be a nice tunable that lets administrators
(who know their workloads) or distributions (e.g. different defaults for
desktop and server) adapt their installations.
In theory a, b and c should work fine together in case we need all of them.
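
To make (c) a bit more concrete, here is roughly what I have in mind -
completely untested, and both the knob name and the default are only
placeholders:

/* mm/vmscan.c: allow at most this share (in percent) of the file pages
 * on the active list; 50 keeps today's 1:1 behaviour.
 */
int sysctl_active_file_ratio __read_mostly = 50;

static int inactive_file_is_low_global(struct zone *zone)
{
	unsigned long active = zone_page_state(zone, NR_ACTIVE_FILE);
	unsigned long inactive = zone_page_state(zone, NR_INACTIVE_FILE);

	/* "low" once the active share exceeds the configured percentage */
	return active * 100 > (active + inactive) * sysctl_active_file_ratio;
}

/* kernel/sysctl.c: hypothetical entry in vm_table, clamped to 0..100 */
{
	.procname	= "active_file_ratio",
	.data		= &sysctl_active_file_ratio,
	.maxlen		= sizeof(sysctl_active_file_ratio),
	.mode		= 0644,
	.proc_handler	= proc_dointvec_minmax,
	.extra1		= &zero,
	.extra2		= &one_hundred,
},

A desktop could then stay at the current 50 while e.g. a streaming
server could be set to something much smaller; the memcg variant of the
check would of course need the same treatment.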
> The big question is, what workload suffers from
> having the inactive list at 50% of the page cache?
>
> So far the only big problem we have seen is on a
> very unbalanced virtual machine, with 256MB RAM
> and 4 fast disks. The disks simply have more IO
> in flight at once than what fits in the inactive
> list.
Did I get you right that this refers to the write case - which would
explain why it builds up buffers to the 50% max?
Note: it even uses up to 64 disks, with one disk per thread, so e.g. 16
threads => 16 disks.
Regarding "unbalanced" I'd like to mention that over the years I have
learned that virtualized systems can end up looking that way after a
while without it ever being intended - it happens by adding more and
more guests and letting guest memory ballooning take care of it.
> This is a very untypical situation, and we can
> probably solve it by excluding the in-flight pages
> from the active/inactive file calculation.
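
Just to make sure I read that right, something along these lines
(completely untested, and NR_WRITEBACK is of course only a rough
stand-in for "in flight")?

static int inactive_file_is_low_global(struct zone *zone)
{
	unsigned long active = zone_page_state(zone, NR_ACTIVE_FILE);
	unsigned long inactive = zone_page_state(zone, NR_INACTIVE_FILE);
	unsigned long inflight = zone_page_state(zone, NR_WRITEBACK);

	/* pages still under I/O can't be reclaimed right now anyway, so
	 * don't let them make the inactive list look big enough
	 */
	inactive = (inactive > inflight) ? inactive - inflight : 0;

	return active > inactive;
}

Whether the read side (e.g. pages still locked for readahead I/O) needs
the same treatment is something the tests would have to show.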
--
Grüsse / regards, Christian Ehrhardt
IBM Linux Technology Center, System z Linux Performance