Message-ID: <20170731144806.GA7791@li70-116.members.linode.com>
Date: Mon, 31 Jul 2017 14:48:07 +0000
From: Josef Bacik <josef@...icpanda.com>
To: Mike Galbraith <umgwanakikbuti@...il.com>
Cc: Josef Bacik <josef@...icpanda.com>,
Joel Fernandes <joelaf@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
Juri Lelli <Juri.Lelli@....com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Patrick Bellasi <patrick.bellasi@....com>,
Brendan Jackman <brendan.jackman@....com>,
Chris Redpath <Chris.Redpath@....com>,
Michael Wang <wangyun@...ux.vnet.ibm.com>,
Matt Fleming <matt@...eblueprint.co.uk>
Subject: Re: wake_wide mechanism clarification
On Mon, Jul 31, 2017 at 03:42:25PM +0200, Mike Galbraith wrote:
> On Mon, 2017-07-31 at 12:21 +0000, Josef Bacik wrote:
> >
> > I've been working in this area recently because of a cpu imbalance problem.
> > Wake_wide() definitely makes it so we're waking affine way too often, but I
> > think messing with wake_wide() to solve that problem is the wrong solution.
> > This is just a heuristic to see if we should wake affine; the simpler the
> > better. I solved the problem of waking affine too often like this:
> >
> > https://marc.info/?l=linux-kernel&m=150003849602535&w=2
>
> Wait a minute, that's not quite fair :) Wake_wide() can't be blamed
> for causing too frequent affine wakeups when what it does is filter
> some out. While it may not reject aggressively enough for you (which is
> why you bent it up to be very aggressive), it seems the problem from your
> load's POV is the scheduler generally being too eager to bounce.
>
Yeah sorry, I hate this stuff because it's so hard to talk about without mixing
up different ideas. I should say the scheduler in general prefers to wake
affine super hard, and wake_wide() is conservative in its filtering of this
behavior. The rest still holds true: I think tinkering with wake_wide() is
hard and the wrong place to solve my problem. It's a good first step, and we
can be smarter further down the wakeup path.
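
For anyone following along, the heuristic we're talking about is roughly the
following (paraphrased from kernel/sched/fair.c in current mainline; treat it
as a sketch rather than the exact code):

static int wake_wide(struct task_struct *p)
{
	unsigned int master = current->wakee_flips;
	unsigned int slave = p->wakee_flips;
	int factor = this_cpu_read(sd_llc_size);

	/* Treat whichever side switches wakees faster as the "master". */
	if (master < slave)
		swap(master, slave);

	/*
	 * Wake affine (return 0) unless both sides are flipping wakees
	 * faster than the LLC size, i.e. the waker/wakee relationship
	 * looks like 1:N with N wider than the cache domain.
	 */
	if (slave < factor || master < slave * factor)
		return 0;
	return 1;
}

So it only rejects affine wakeups when the flip counters, which decay every HZ,
say the wakeup pattern is too wide for the LLC; everything else falls through
to the affine path.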
> I've also played with rate limiting migration per task, but it had
> negative effects too: when idle/periodic balance pulls buddies apart,
> rate limiting inhibits them from quickly finding each other again, even
> though undoing all that hard load balancer work would be a throughput
> win. Sigh.
>
That's why I did the HZ thing: we don't touch the task for HZ jiffies to let
things settle out, and then allow affine wakeups again after that. Now HZ may
be an eternity in scheduler time, but I think it's a good middle ground. For
our case the box is loaded constantly, so we basically never want affine
wakeups for our app. For the case where there's spiky behavior we'll return to
normal affine wakeups a short while later.
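
To make the idea concrete, the mechanism is something along these lines (a
sketch only, not the literal patch from the link above; the field and function
names here are made up for illustration):

/* Illustrative per-task timestamp, e.g. added to struct task_struct: */
	unsigned long	last_balance_ts;	/* jiffies of last LB migration */

/*
 * At wakeup time: if load balancing moved the task within the last
 * HZ jiffies (one second), don't wake affine; let the load balancer's
 * placement settle before we start pulling the task around again.
 */
static int affine_wakeup_allowed(struct task_struct *p)
{
	return time_after(jiffies, p->last_balance_ts + HZ);
}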
But from my admittedly limited testing it appears to be a win overall. Thanks,
Josef