Message-ID: <e9591680-4278-5344-6876-d16e8f9c53c4@oracle.com>
Date: Thu, 15 Feb 2018 13:21:46 -0500
From: Steven Sistare <steven.sistare@...cle.com>
To: Mike Galbraith <efault@....de>,
Rohit Jain <rohit.k.jain@...cle.com>,
linux-kernel@...r.kernel.org
Cc: peterz@...radead.org, mingo@...hat.com, joelaf@...gle.com,
jbacik@...com, riel@...hat.com, juri.lelli@...hat.com,
dhaval.giani@...cle.com
Subject: Re: [RFC 1/2] sched: reduce migration cost between faster caches for
idle_balance
On 2/15/2018 1:07 PM, Mike Galbraith wrote:
> On Thu, 2018-02-15 at 11:35 -0500, Steven Sistare wrote:
>> On 2/10/2018 1:37 AM, Mike Galbraith wrote:
>>> On Fri, 2018-02-09 at 11:08 -0500, Steven Sistare wrote:
>>>>>> @@ -8804,7 +8803,8 @@ static int idle_balance(struct rq *this_rq, struct rq_flags *rf)
>>>>>>  		if (!(sd->flags & SD_LOAD_BALANCE))
>>>>>>  			continue;
>>>>>>  
>>>>>> -		if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost) {
>>>>>> +		if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost +
>>>>>> +		    sd->sched_migration_cost) {
>>>>>>  			update_next_balance(sd, &next_balance);
>>>>>>  			break;
>>>>>>  		}
>>>>>
>>>>> Ditto.
>>>>
>>>> The old code did not migrate if the expected costs exceeded the expected idle
>>>> time. The new code just adds the sd-specific penalty (essentially loss of cache
>>>> footprint) to the costs. The for_each_domain loop visits smallest to largest
>>>> sd's, hence visiting smallest to largest migration costs (though the tunables do
>>>> not enforce an ordering), and bails at the first sd where the total cost is a lose.
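
To make that concrete, the shape of the loop is roughly this (a sketch
of the patched code, not the complete function; locking and the cost
accounting around the balance call itself are elided):

	for_each_domain(this_cpu, sd) {
		if (!(sd->flags & SD_LOAD_BALANCE))
			continue;

		/*
		 * Bail at the first (smallest) domain where the total
		 * expected cost -- the measured newidle LB cost plus
		 * the per-domain migration penalty -- exceeds the
		 * expected idle time.
		 */
		if (this_rq->avg_idle < curr_cost + sd->max_newidle_lb_cost +
		    sd->sched_migration_cost) {
			update_next_balance(sd, &next_balance);
			break;
		}

		/* ... load_balance() and cost accounting ... */
	}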
>>>
>>> Hrm..
>>>
>>> You're now adding a hypothetical cost to the measured cost of running
>>> the LB machinery, which implies that the measurement is insufficient,
>>> but you still don't say why it is insufficient. What happens if you
>>> don't do that? I ask, because when I removed the...
>>>
>>> this_rq->avg_idle < sysctl_sched_migration_cost
>>>
>>> ...bits to check removal effect for Peter, the original reason for it
>>> being added did not re-materialize, making me wonder why you need to
>>> make this cutoff more aggressive.
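
For anyone following along, that is the early bail-out near the top of
idle_balance(). Roughly, from memory of the 4.15-era code:

	/*
	 * Early exit: skip new-idle balancing entirely when the
	 * average idle time is shorter than the fixed migration-cost
	 * threshold, or when no CPU in the root domain is overloaded.
	 */
	if (this_rq->avg_idle < sysctl_sched_migration_cost ||
	    !this_rq->rd->overload) {
		rcu_read_lock();
		sd = rcu_dereference_check_sched_domain(this_rq->sd);
		if (sd)
			update_next_balance(sd, &next_balance);
		rcu_read_unlock();
		goto out;
	}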
>>
>> The current code with sysctl_sched_migration_cost discourages migration
>> too much, per our test results.
>
> That's why I asked you what happens if you only whack the _apparently_
> (but maybe not) obsolete old throttle: it appeared likely that your win
> came from allowing a bit more migration than the simple throttle
> allowed, which, if true, would obviate the need for anything more.
>
>> Can you provide more details on the sysbench oltp test that motivated you
>> to add sysctl_sched_migration_cost to idle_balance, so Rohit can re-test it?
>
> The problem at that time was the cycle overhead of entering that LB
> path at high frequency. Dirt simple.
I get that. I meant: please provide details on the test parameters and
config, if you remember them.
- Steve