[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-Id: <200802191735.00222.nickpiggin@yahoo.com.au>
Date: Tue, 19 Feb 2008 17:34:59 +1100
From: Nick Piggin <nickpiggin@...oo.com.au>
To: KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>
Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@...fujitsu.com>,
Balbir Singh <balbir@...ux.vnet.ibm.com>,
Rik van Riel <riel@...hat.com>,
Lee Schermerhorn <Lee.Schermerhorn@...com>, linux-mm@...ck.org,
linux-kernel@...r.kernel.org
Subject: Re: [RFC][PATCH] the proposal of improve page reclaim by throttle
On Tuesday 19 February 2008 16:44, KOSAKI Motohiro wrote:
> background
> ========================================
> current VM implementation doesn't has limit of # of parallel reclaim.
> when heavy workload, it bring to 2 bad things
> - heavy lock contention
> - unnecessary swap out
>
> abount 2 month ago, KAMEZA Hiroyuki proposed the patch of page
> reclaim throttle and explain it improve reclaim time.
> http://marc.info/?l=linux-mm&m=119667465917215&w=2
>
> but unfortunately it works only memcgroup reclaim.
> Today, I implement it again for support global reclaim and mesure it.
>
>
> test machine, method and result
> ==================================================
> <test machine>
> CPU: IA64 x8
> MEM: 8GB
> SWAP: 2GB
>
> <test method>
> got hackbench from
> http://people.redhat.com/mingo/cfs-scheduler/tools/hackbench.c
>
> $ /usr/bin/time hackbench 120 process 1000
>
> this parameter mean consume all physical memory and
> 1GB swap space on my test environment.
>
> <test result (average of 3 times measurement)>
>
> before:
> hackbench result: 282.30
> /usr/bin/time result
> user: 14.16
> sys: 1248.47
> elapse: 432.93
> major fault: 29026
> max parallel reclaim tasks: 1298
> max consumption time of
> try_to_free_pages(): 70394
>
> after:
> hackbench result: 30.36
> /usr/bin/time result
> user: 14.26
> sys: 294.44
> elapse: 118.01
> major fault: 3064
> max parallel reclaim tasks: 4
> max consumption time of
> try_to_free_pages(): 12234
>
>
> conclusion
> =========================================
> this patch improve 3 things.
> 1. reduce unnecessary swap
> (see above major fault. about 90% reduced)
> 2. improve throughput performance
> (see above hackbench result. about 90% reduced)
> 3. improve interactive performance.
> (see above max consumption of try_to_free_pages.
> about 80% reduced)
> 4. reduce lock contention.
> (see above sys time. about 80% reduced)
>
>
> Now, we got about 1000% performance improvement of hackbench :)
>
>
>
> foture works
> ==========================================================
> - more discussion with memory controller guys.
Hi,
Yeah this is definitely needed and a nice result.
I'm worried about a) placing a global limit on parallelism, and b)
placing a limit on parallelism at all.
I think it should maybe be a per-zone thing...
What happens if you make it a per-zone mutex, and allow just a single
process to reclaim pages from a given zone at a time? I guess that is
going to slow down throughput a little bit in some cases though...
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists