Message-ID: <20180213144326.GO25201@hirez.programming.kicks-ass.net>
Date: Tue, 13 Feb 2018 15:43:26 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Mel Gorman <mgorman@...hsingularity.net>
Cc: Ingo Molnar <mingo@...nel.org>, Mike Galbraith <efault@....de>,
Matt Fleming <matt@...eblueprint.co.uk>,
Giovanni Gherdovich <ggherdovich@...e.cz>,
LKML <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 6/6] sched/numa: Delay retrying placement for automatic NUMA balance after wake_affine

On Tue, Feb 13, 2018 at 02:18:12PM +0000, Mel Gorman wrote:
> On Tue, Feb 13, 2018 at 03:01:37PM +0100, Peter Zijlstra wrote:
> > On Tue, Feb 13, 2018 at 01:37:30PM +0000, Mel Gorman wrote:
> > > +static void
> > > +update_wa_numa_placement(struct task_struct *p, int prev_cpu, int target)
> > > +{
> > > +	unsigned long interval;
> > > +
> > > +	if (!static_branch_likely(&sched_numa_balancing))
> > > +		return;
> > > +
> > > +	/* If balancing has no preference then continue gathering data */
> > > +	if (p->numa_preferred_nid == -1)
> > > +		return;
> > > +
> > > +	/*
> > > +	 * If the wakeup is not affecting locality then it is neutral from
> > > +	 * the perspective of NUMA balancing so continue gathering data.
> > > +	 */
> > > +	if (cpus_share_cache(prev_cpu, target))
> > > +		return;
> >
> > Dang, I wanted to mention this before, but it slipped my mind. The
> > comment and code don't match.
> >
> > Did you want to write:
> >
> > 	if (cpu_to_node(prev_cpu) == cpu_to_node(target))
> > 		return;
> >
>
> Well, it was deliberate. While it's possible to be on the same memory
> node and not sharing cache, the scheduler typically is more concerned with
> the LLC than NUMA per-se. If they share LLC, then I also assume that they
> share memory locality.

True, but the remaining code only has effect for numa balance, which is
concerned with nodes. So I don't see the point of using something
potentially smaller.

Suppose someone did hardware where a node has 2 cache clusters, then
we'd still set a wake_affine back-off for numa-balance, even though it
remains on the same node.

How would that be useful?
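
To make the hypothetical above concrete, here is a minimal userspace sketch. It is not kernel code: the topology tables and every model_* and backoff_* name are invented for illustration, and the two helpers only stand in for what cpus_share_cache() and a cpu_to_node() comparison would report on such a machine. It assumes 8 CPUs, 2 nodes, and 2 LLC clusters per node.

```c
#include <stdbool.h>

/*
 * Hypothetical machine (an assumption, not taken from the patch):
 * 8 CPUs, 2 NUMA nodes, and 2 last-level-cache clusters per node.
 */
#define MODEL_NR_CPUS 8

static const int model_cpu_node[MODEL_NR_CPUS] = { 0, 0, 0, 0, 1, 1, 1, 1 };
static const int model_cpu_llc[MODEL_NR_CPUS]  = { 0, 0, 1, 1, 2, 2, 3, 3 };

/* Stand-in for cpus_share_cache(): true if both CPUs sit in the same LLC cluster. */
static bool model_cpus_share_cache(int a, int b)
{
	return model_cpu_llc[a] == model_cpu_llc[b];
}

/* Stand-in for cpu_to_node(a) == cpu_to_node(b). */
static bool model_same_node(int a, int b)
{
	return model_cpu_node[a] == model_cpu_node[b];
}

/* Check as posted: the early return is skipped whenever the LLC differs. */
static bool backoff_with_llc_check(int prev_cpu, int target)
{
	return !model_cpus_share_cache(prev_cpu, target);
}

/* Suggested check: the early return is skipped only when the node differs. */
static bool backoff_with_node_check(int prev_cpu, int target)
{
	return !model_same_node(prev_cpu, target);
}
```

In this model a wakeup that moves a task from CPU 1 to CPU 2 stays on node 0 but crosses an LLC boundary, so backoff_with_llc_check(1, 2) is true while backoff_with_node_check(1, 2) is false. That is the case in question: the back-off would be set even though the task never left its node. For a move from CPU 3 to CPU 4, which does cross nodes, both checks agree.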