[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <20091214210823.BBAE.A69D9226@jp.fujitsu.com>
Date: Mon, 14 Dec 2009 21:22:24 +0900 (JST)
From: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
To: Rik van Riel <riel@...hat.com>
Cc: kosaki.motohiro@...fujitsu.com, lwoodman@...hat.com,
akpm@...ux-foundation.org, linux-mm@...ck.org,
linux-kernel@...r.kernel.org, minchan.kim@...il.com
Subject: Re: [PATCH v2] vmscan: limit concurrent reclaimers in shrink_zone
> Under very heavy multi-process workloads, like AIM7, the VM can
> get into trouble in a variety of ways. The trouble start when
> there are hundreds, or even thousands of processes active in the
> page reclaim code.
>
> Not only can the system suffer enormous slowdowns because of
> lock contention (and conditional reschedules) between thousands
> of processes in the page reclaim code, but each process will try
> to free up to SWAP_CLUSTER_MAX pages, even when the system already
> has lots of memory free.
>
> It should be possible to avoid both of those issues at once, by
> simply limiting how many processes are active in the page reclaim
> code simultaneously.
>
> If too many processes are active doing page reclaim in one zone,
> simply go to sleep in shrink_zone().
>
> On wakeup, check whether enough memory has been freed already
> before jumping into the page reclaim code ourselves. We want
> to use the same threshold here that is used in the page allocator
> for deciding whether or not to call the page reclaim code in the
> first place, otherwise some unlucky processes could end up freeing
> memory for the rest of the system.
This patch seems very similar to my old effort. afaik, there are another
two benefit.
1. Improve resource gurantee
if thousands tasks start to vmscan at the same time, they eat all memory for
PF_MEMALLOC. it might cause another dangerous problem. some filesystem
and io device don't handle allocation failure properly.
2. Improve OOM contidion behavior
Currently, vmscan don't handle SIGKILL at all. then if the system
have hevy memory pressure, OOM killer can't kill the target process
soon. it might cause OOM killer kill next innocent process.
This patch can fix it.
> @@ -1600,6 +1612,31 @@ static void shrink_zone(int priority, struct zone *zone,
> struct zone_reclaim_stat *reclaim_stat = get_reclaim_stat(zone, sc);
> int noswap = 0;
>
> + if (!current_is_kswapd() && atomic_read(&zone->concurrent_reclaimers) >
> + max_zone_concurrent_reclaimers &&
> + (sc->gfp_mask & (__GFP_IO|__GFP_FS)) ==
> + (__GFP_IO|__GFP_FS)) {
> + /*
> + * Do not add to the lock contention if this zone has
> + * enough processes doing page reclaim already, since
> + * we would just make things slower.
> + */
> + sleep_on(&zone->reclaim_wait);
Oops. this is bug. sleep_on() is not SMP safe.
I made few fixing patch today. I'll post it soon.
btw, following is mesurement result by hackbench.
================
unit: sec
parameter old new
130 (5200 processes) 5.463 4.442
140 (5600 processes) 479.357 7.792
150 (6000 processes) 729.640 20.529
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists