[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAOtvUMdXUjr8ZN+Mv8XuAmMtf0jZ-_r2f_1XJxhpwmJC6orSGw@mail.gmail.com>
Date: Wed, 11 Jan 2012 18:10:25 +0200
From: Gilad Ben-Yossef <gilad@...yossef.com>
To: Milton Miller <miltonm@....com>
Cc: Christoph Lameter <cl@...ux.com>,
Michal Nazarewicz <mina86@...a86.com>,
Mel Gorman <mel@....ul.ie>,
KOSAKI Motohiro <kosaki.motohiro@...fujitsu.com>,
linux-kernel@...r.kernel.org, Chris Metcalf <cmetcalf@...era.com>,
Peter Zijlstra <a.p.zijlstra@...llo.nl>,
Frederic Weisbecker <fweisbec@...il.com>,
Russell King <linux@....linux.org.uk>, linux-mm@...ck.org,
Pekka Enberg <penberg@...nel.org>,
Matt Mackall <mpm@...enic.com>, Rik van Riel <riel@...hat.com>,
Andi Kleen <andi@...stfloor.org>,
Sasha Levin <levinsasha928@...il.com>,
Andrew Morton <akpm@...ux-foundation.org>,
Alexander Viro <viro@...iv.linux.org.uk>,
linux-fsdevel@...r.kernel.org, Avi Kivity <avi@...hat.com>
Subject: Re: [PATCH v6 7/8] mm: only IPI CPUs to drain local pages if they exist
On Wed, Jan 11, 2012 at 9:04 AM, Milton Miller <miltonm@....com> wrote:
>> > > @@ -1097,7 +1105,19 @@ void drain_local_pages(void *arg)
>> > > */
>> > > void drain_all_pages(void)
>> > > {
>> > > - on_each_cpu(drain_local_pages, NULL, 1);
>> > > + int cpu;
>> > > + struct per_cpu_pageset *pcp;
>> > > + struct zone *zone;
>> > > +
>> > > + for_each_online_cpu(cpu)
>> > > + for_each_populated_zone(zone) {
>> > > + pcp = per_cpu_ptr(zone->pageset, cpu);
>> > > + if (pcp->pcp.count)
>> > > + cpumask_set_cpu(cpu, cpus_with_pcps);
>> > > + else
>> > > + cpumask_clear_cpu(cpu, cpus_with_pcps);
>> > > + }
>> > > + on_each_cpu_mask(cpus_with_pcps, drain_local_pages, NULL, 1);
>> >
>
> This will only select cpus that have pcp pages in the last zone,
> not pages in any populated zone.
Oy! you are very right.
>
> Call cpu_mask_clear before the for_each_online_cpu loop, and only
> set cpus in the loop.
>
> Hmm, that means we actually need a lock for using the static mask.
We don't have to clear before the loop or lock. We can do something like this:
int cpu;
struct per_cpu_pageset *pcp;
struct zone *zone;
for_each_online_cpu(cpu) {
bool has_pcps = false;
for_each_populated_zone(zone) {
pcp = per_cpu_ptr(zone->pageset, cpu);
if (pcp->pcp.count) {
has_pcps = true;
break;
}
}
if (has_pcps)
cpumask_set_cpu(cpu, cpus_with_pcps);
else
cpumask_clear_cpu(cpu, cpus_with_pcps);
}
on_each_cpu_mask(cpus_with_pcps, drain_local_pages, NULL, 1);
>
> -- what would happen without any lock?
> [The last to store would likely see all the cpus and send them IPIs
> but the earlier, racing callers would miss some cpus (It will test
> our smp_call_function_many locking!), and would retry before all the
> pages were drained from pcp pools]. The locking should be better
> than racing -- one cpu will peek and pull all per-cpu data to itself,
> then copy the mask to its smp_call_function_many element, then wait
> for all the cpus it found cpus to process their list pulling their
> pcp data back to the owners. Not much sense in the racing cpus
> figighting to bring the per-cpu data to themselves to write to the
> now contended static mask while pulling the zone pcp data from the
> owning cpus that are trying to push to the buddy lists.
>
That is a very good point and I guess you are right that the locking
saves cache bounces and probably also some IPIs.
I am not 100% it is a win though. The lockless approach is simpler,
has zero risk of dead locks. I also fear that any non interruptable lock
(be it spinlock or interruptable lock non mutex) might delay an OOM
in progress.
Having the lock there is not a correctness issue - it is a performance
enhancement. Because this is not a fast path, i would go for simple
and less risk and not have the lock there at all.
> The benefit numbers in the changelog need to be measured again
> after this correction.
>
Yes, of course. Preliminary numbers with the lockless version above
still looks good.
I am guessing though that Mel's patch will make all this moot anyway
since if we're not doing a drain_all in the direct reclaim we're left with
very rare code paths (memory error and memory migration) that call
the drain_all and they wont tend to do that concurrently like the direct
reclaim.
>
>
> Disabling preemption around online loop and the call will prevent
> races with cpus going offline, but we do not that we care as the
> offline notification will cause the notified cpu to drain that pool.
> on_each_cpu_mask disables preemption as part of its processing.
>
> If we document the o-e-c-m behaviour then we should consider putting a
> comment here, or at least put the previous paragraph in the change log.
I noticed that Mel's patch already added get_online_cpus() in drain_all.
>
>> > > }
>> > >
>> > > #ifdef CONFIG_HIBERNATION
>> > > @@ -3601,6 +3621,10 @@ static void setup_zone_pageset(struct zone *zone)
>> > > void __init setup_per_cpu_pageset(void)
>> > > {
>> > > struct zone *zone;
>> > > + int ret;
>> > > +
>> > > + ret = zalloc_cpumask_var(&cpus_with_pcps, GFP_KERNEL);
>> > > + BUG_ON(!ret);
>> > >
>
> Switching to cpumask_t will eliminate this hunk. Even if we decide
> to keep it a cpumask_var_t we don't need to pre-zero it as we set
> the entire mask so alloc_cpumask_var would be sufficient.
>
hmm... then we can make it a static local variable and it doesn't even
make the kernel image really bigger since it's in the BSS. Cool.
Thanks,
Gilad
--
Gilad Ben-Yossef
Chief Coffee Drinker
gilad@...yossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com
"Unfortunately, cache misses are an equal opportunity pain provider."
-- Mike Galbraith, LKML
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@...r.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Powered by blists - more mailing lists