Message-ID: <CAOtvUMddUAATZcU_5jLgY10ocsHNnOO2GC2c4ecYO9KGt-U7VQ@mail.gmail.com>
Date: Mon, 26 Sep 2011 09:47:10 +0300
From: Gilad Ben-Yossef <gilad@...yossef.com>
To: Shaohua Li <shaohua.li@...el.com>
Cc: "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Frederic Weisbecker <fweisbec@...il.com>,
Russell King <linux@....linux.org.uk>,
Chris Metcalf <cmetcalf@...era.com>,
"linux-mm@...ck.org" <linux-mm@...ck.org>,
Christoph Lameter <cl@...ux-foundation.org>,
Pekka Enberg <penberg@...nel.org>,
Matt Mackall <mpm@...enic.com>
Subject: Re: [PATCH 4/5] mm: Only IPI CPUs to drain local pages if they exist
Hi Li,
Thank you for the feedback!
On Mon, Sep 26, 2011 at 4:52 AM, Shaohua Li <shaohua.li@...el.com> wrote:
> On Sun, 2011-09-25 at 16:54 +0800, Gilad Ben-Yossef wrote:
>> Use a cpumask to track CPUs with per-cpu pages in any zone
>> and only send an IPI requesting CPUs to drain these pages
>> to the buddy allocator if they actually have pages.
> Did you have evaluation why the fine-grained ipi is required? I suppose
> every CPU has local pages here.
I have given it a lot of thought and I believe it's a question of work
load - in a "classic" symmetric work load on a small SMP system I
would indeed expect each CPU to have a per cpu pages cache in some
zone. However, we are seeing more and more push towards massively
multi core systems and we add support for using them (e.g. cpusets,
Frederic's dynamic tick task patch set etc.). For these work loads,
things can be different:
In a system where you have many cores (or hardware threads) and you
dedicate processors to run a single CPU-bound task that performs
virtually no system calls (quite typical for some high performance
computing setups), you can very well have situations where the per-cpu
list of released pages is empty on many processors: the working set
per CPU rarely changes, so there was no release since the last drain.
Or just consider a multicore machine where a lot of processors are
simply idle with no activity (and we now have 8 cores / 128 hw
threads in a single package) - again, there is no per-CPU local page
cache since there was zero activity since the last drain, but the IPI
will be yanking cores out of low power states to do the check.
I do not know if these scenarios warrant the additional overhead,
certainly not in all situations. Maybe the right thing is to make it
dependent on a config option. As I stated in the patch description,
that is one of the things I'm interested in feedback on.
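To make the trade-off concrete, here is a minimal user-space sketch of
the mask-and-drain idea. It is not the patch itself: NR_CPUS, pcp_count,
cpus_with_pcps, pcp_free_page(), drain_local() and
drain_all_pages_sketch() are all hypothetical names, a plain 64-bit word
stands in for the kernel cpumask, and a direct function call stands in
for the IPI. It only shows that the number of CPUs disturbed tracks the
number of CPUs that actually hold cached pages.

```c
#include <assert.h>
#include <stdint.h>

#define NR_CPUS 64                 /* assumption: mask fits in one 64-bit word */

static int pcp_count[NR_CPUS];     /* hypothetical per-CPU cached page count */
static uint64_t cpus_with_pcps;    /* bit n set => CPU n has cached pages */

/* A CPU releases a page into its local cache: count it and flag the CPU. */
static void pcp_free_page(int cpu)
{
    assert(cpu >= 0 && cpu < NR_CPUS);
    pcp_count[cpu]++;
    cpus_with_pcps |= UINT64_C(1) << cpu;
}

/* Stand-in for the IPI handler: flush the local cache and clear the flag. */
static void drain_local(int cpu)
{
    pcp_count[cpu] = 0;
    cpus_with_pcps &= ~(UINT64_C(1) << cpu);
}

/* Visit only flagged CPUs; return how many "IPIs" would have been sent. */
static int drain_all_pages_sketch(void)
{
    int ipis = 0;

    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        if (cpus_with_pcps & (UINT64_C(1) << cpu)) {
            drain_local(cpu);
            ipis++;
        }
    }
    return ipis;
}
```

With pages released on only two CPUs, the drain touches those two and
leaves the other 62 alone; a second drain right after touches none. The
cost is the extra mask update on the free path, which is the overhead
in question above.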
Thanks,
Gilad
--
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@...yossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com
"I've seen things you people wouldn't believe. Goto statements used to
implement co-routines. I watched C structures being stored in
registers. All those moments will be lost in time... like tears in
rain... Time to die. "