Message-ID: <1518718056.13961.23.camel@gmx.de>
Date: Thu, 15 Feb 2018 19:07:36 +0100
From: Mike Galbraith <efault@....de>
To: Steven Sistare <steven.sistare@...cle.com>,
Rohit Jain <rohit.k.jain@...cle.com>,
linux-kernel@...r.kernel.org
Cc: peterz@...radead.org, mingo@...hat.com, joelaf@...gle.com,
jbacik@...com, riel@...hat.com, juri.lelli@...hat.com,
dhaval.giani@...cle.com
Subject: Re: [RFC 1/2] sched: reduce migration cost between faster caches
for idle_balance
On Thu, 2018-02-15 at 11:35 -0500, Steven Sistare wrote:
> On 2/10/2018 1:37 AM, Mike Galbraith wrote:
> > On Fri, 2018-02-09 at 11:08 -0500, Steven Sistare wrote:
> >>>> @@ -8804,7 +8803,8 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
> >>>>  		if (!(sd->flags & SD_LOAD_BALANCE))
> >>>>  			continue;
> >>>>  
> >>>> -		if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost) {
> >>>> +		if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost +
> >>>> +					sd->sched_migration_cost) {
> >>>>  			update_next_balance(sd, &next_balance);
> >>>>  			break;
> >>>>  		}
> >>>
> >>> Ditto.
> >>
> >> The old code did not migrate if the expected costs exceeded the expected idle
> >> time. The new code just adds the sd-specific penalty (essentially loss of cache
> >> footprint) to the costs. The for_each_domain loop visits smallest to largest
> >> sd's, hence smallest to largest migration costs (though the tunables do
> >> not enforce an ordering), and bails at the first sd where the total cost is a loss.
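
[ For readers following along: a toy userspace model of the comparison
  being described. Domain names and costs are invented, and the
  migration_cost field stands in for the RFC's sd->sched_migration_cost;
  the real loop is the one patched above in idle_balance(), which also
  accounts the time actually spent balancing rather than the estimate
  accumulated here. ]

#include <stdio.h>

struct domain {
	const char *name;
	unsigned long max_newidle_lb_cost;	/* measured LB cost, ns */
	unsigned long migration_cost;		/* per-sd cache penalty, ns */
};

int main(void)
{
	/* smallest to largest, as for_each_domain() walks them */
	struct domain sds[] = {
		{ "SMT",   5000,       0 },
		{ "MC",   20000,  500000 },
		{ "NUMA", 80000, 5000000 },
	};
	unsigned long avg_idle = 600000;	/* expected idle time, ns */
	unsigned long curr_cost = 0;

	for (int i = 0; i < 3; i++) {
		struct domain *sd = &sds[i];

		/* bail at the first sd where the total cost is a loss */
		if (avg_idle < curr_cost + sd->max_newidle_lb_cost +
			       sd->migration_cost) {
			printf("%s: skip, too costly\n", sd->name);
			break;
		}
		printf("%s: balance\n", sd->name);
		curr_cost += sd->max_newidle_lb_cost;
	}
	return 0;
}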
> >
> > Hrm..
> >
> > You're now adding a hypothetical cost to the measured cost of running
> > the LB machinery, which implies that the measurement is insufficient,
> > but you still don't say why it is insufficient. What happens if you
> > don't do that? I ask, because when I removed the...
> >
> > this_rq->avg_idle < sysctl_sched_migration_cost
> >
> > ...bits to check removal effect for Peter, the original reason for it
> > being added did not re-materialize, making me wonder why you need to
> > make this cutoff more aggressive.
>
> The current code with sysctl_sched_migration_cost discourages migration
> too much, per our test results.
That's why I asked what happens if you only whack the _apparently_
(but maybe not) obsolete old throttle: it appeared likely that your win
came from allowing a bit more migration than the simple throttle
allowed, which, if true, would obviate the need for anything more.
> Can you provide more details on the sysbench oltp test that motivated you
> to add sysctl_sched_migration_cost to idle_balance, so Rohit can re-test it?
The problem at that time was the cycle overhead of entering that LB
path at high frequency. Dirt simple.
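
[ Concretely, that throttle is the avg_idle vs sysctl_sched_migration_cost
  check quoted above. A minimal userspace caricature of its effect;
  500000 ns is the kernel's default for the knob, the sample values are
  invented: ]

#include <stdbool.h>
#include <stdio.h>

/* default of /proc/sys/kernel/sched_migration_cost_ns */
static unsigned long sysctl_sched_migration_cost = 500000;

/* the throttle: don't even enter the balancing path unless the CPU
 * expects to stay idle long enough to amortize its cost */
static bool try_idle_balance(unsigned long avg_idle)
{
	return avg_idle >= sysctl_sched_migration_cost;
}

int main(void)
{
	unsigned long samples[] = { 30000, 450000, 900000 };	/* ns */

	for (int i = 0; i < 3; i++)
		printf("avg_idle=%6lu ns -> %s\n", samples[i],
		       try_idle_balance(samples[i]) ?
		       "run idle_balance" : "skip (throttled)");
	return 0;
}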
-Mike