Message-ID: <1501521826.5348.12.camel@gmail.com>
Date: Mon, 31 Jul 2017 19:23:46 +0200
From: Mike Galbraith <umgwanakikbuti@...il.com>
To: Josef Bacik <josef@...icpanda.com>
Cc: Joel Fernandes <joelaf@...gle.com>,
Peter Zijlstra <peterz@...radead.org>,
LKML <linux-kernel@...r.kernel.org>,
Juri Lelli <Juri.Lelli@....com>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Patrick Bellasi <patrick.bellasi@....com>,
Brendan Jackman <brendan.jackman@....com>,
Chris Redpath <Chris.Redpath@....com>,
Michael Wang <wangyun@...ux.vnet.ibm.com>,
Matt Fleming <matt@...eblueprint.co.uk>
Subject: Re: wake_wide mechanism clarification
On Mon, 2017-07-31 at 14:48 +0000, Josef Bacik wrote:
> On Mon, Jul 31, 2017 at 03:42:25PM +0200, Mike Galbraith wrote:
> > On Mon, 2017-07-31 at 12:21 +0000, Josef Bacik wrote:
> > >
> > > I've been working in this area recently because of a cpu imbalance problem.
> > > Wake_wide() definitely makes it so we're waking affine way too often, but I
> > > think messing with wake_wide to solve that problem is the wrong solution. This
> > > is just a heuristic to see if we should wake affine, the simpler the better. I
> > > solved the problem of waking affine too often like this
> > >
> > > https://marc.info/?l=linux-kernel&m=150003849602535&w=2
> >
> > Wait a minute, that's not quite fair :) Wake_wide() can't be blamed
> > for causing too frequent affine wakeups when what it does is filter
> > some out. While it may not reject aggressively enough for you (which is
> > why you bent it up to be very aggressive), it seems the problem from your
> > load's POV is the scheduler generally being too eager to bounce.
> >
>
> Yeah sorry, I hate this stuff because it's so hard to talk about without mixing
> up different ideas. I should say the scheduler in general prefers to wake
> affine super hard, and wake_wide() is conservative in its filtering of this
> behavior. The rest still holds true: I think tinkering with it is just hard
> and the wrong place to do it; it's a good first step, and we can be smarter
> further down.
Yeah, it's hard, and yeah, bottom line remains unchanged.
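
(Aside, for anyone following along: the filter we're arguing about boils
down to roughly the shape below. This is a standalone sketch of the
flip-counter idea, not the kernel code itself; the struct, the llc_size
value and the toy main() are made up for illustration.)

/*
 * Sketch of the flip-counter heuristic: each waker counts how often it
 * switches to a different wakee, and wake_wide() only says "spread out"
 * when both sides are switching partners faster than the last-level
 * cache could plausibly hold the whole group.
 */
#include <stdio.h>

struct task {
	unsigned int wakee_flips;	/* wakee switches (the kernel also decays this) */
	const struct task *last_wakee;	/* whom we woke last time */
};

static const unsigned int llc_size = 8;	/* pretend the LLC spans 8 CPUs */

/* Called by the waker on every wakeup of p. */
static void record_wakee(struct task *waker, const struct task *p)
{
	if (waker->last_wakee != p) {
		waker->last_wakee = p;
		waker->wakee_flips++;
	}
}

/* 1 = wake wide (don't pull the wakee next to the waker), 0 = stay affine. */
static int wake_wide(const struct task *waker, const struct task *wakee)
{
	unsigned int master = waker->wakee_flips;
	unsigned int slave = wakee->wakee_flips;

	if (master < slave) {
		unsigned int tmp = master;
		master = slave;
		slave = tmp;
	}
	if (slave < llc_size || master < slave * llc_size)
		return 0;
	return 1;
}

int main(void)
{
	struct task a = { 0, NULL }, b = { 0, NULL };
	int i;

	/* A 1:1 pair barely flips, so it stays affine. */
	for (i = 0; i < 4; i++)
		record_wakee(&a, &b);
	printf("1:1 pair -> wake_wide = %d\n", wake_wide(&a, &b));

	/* Both sides switching partners constantly (think M:N pool): spread. */
	a.wakee_flips = 100;
	b.wakee_flips = 10;
	printf("M:N pool -> wake_wide = %d\n", wake_wide(&a, &b));
	return 0;
}

The point being that a 1:1 pair hardly ever flips and stays affine, while
a many-to-many load has to be flipping on both ends before it gets spread.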
> > I've also played with rate limiting migration per task, but it had
> > negative effects too: when idle/periodic balance pulls buddies apart,
> > rate limiting inhibits them from quickly finding each other again, even
> > though undoing all that hard load balancer work would be a throughput
> > win. Sigh.
> >
>
> That's why I did the HZ thing: we don't touch the task for HZ to let things
> settle out, and then allow affine wakeups after that.
I kinda like the way you did it better than what I tried, but until a
means exists to _target_ the win, it's gonna be rob Peter to pay Paul,
swap roles, repeat endlessly.
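
(If it helps to see the trade-off concretely, the HZ idea above is roughly
the shape below. All names, the HZ value and the helper are invented for
the example; it's a sketch of the idea, not the actual patch.)

/*
 * Sketch of the "leave the task alone for HZ" idea: remember when the
 * task last migrated, and refuse affine wakeups until a full HZ worth
 * of ticks has passed, so balancing can settle before we bounce it again.
 */
#include <stdio.h>
#include <stdbool.h>

#define HZ 250	/* ticks per second; pick your kernel's value */

struct task {
	unsigned long last_migration;	/* "jiffies" of the last cross-CPU move */
};

static bool affine_wakeup_allowed(const struct task *p, unsigned long now)
{
	return now - p->last_migration >= HZ;
}

int main(void)
{
	struct task p = { .last_migration = 1000 };

	printf("at tick 1100: %d\n", affine_wakeup_allowed(&p, 1100)); /* 0, too soon */
	printf("at tick 1300: %d\n", affine_wakeup_allowed(&p, 1300)); /* 1, settled */
	return 0;
}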
-Mike