linux-kernel - Re: [PATCH 3/3] sched: limit cpu search and rotate search window for scalability


Message-ID: <d7efc4d3-7c50-752e-a1ae-a164d991dbcc@oracle.com>
Date:   Tue, 24 Apr 2018 17:10:34 -0700
From:   Subhra Mazumdar <subhra.mazumdar@...cle.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org, mingo@...hat.com,
        daniel.lezcano@...aro.org, steven.sistare@...cle.com,
        dhaval.giani@...cle.com, rohit.k.jain@...cle.com
Subject: Re: [PATCH 3/3] sched: limit cpu search and rotate search window for
 scalability



On 04/24/2018 05:53 AM, Peter Zijlstra wrote:
> On Mon, Apr 23, 2018 at 05:41:16PM -0700, subhra mazumdar wrote:
>> Lower the lower limit of idle cpu search in select_idle_cpu() and also put
>> an upper limit. This helps in scalability of the search by restricting the
>> search window.
>> @@ -6297,15 +6297,24 @@ static int select_idle_cpu(struct task_struct *p, struct sched_domain *sd, int t
>>   
>>   	if (sched_feat(SIS_PROP)) {
>>   		u64 span_avg = sd->span_weight * avg_idle;
>> -		if (span_avg > 4*avg_cost)
>> +		if (span_avg > 2*avg_cost) {
>>   			nr = div_u64(span_avg, avg_cost);
>> -		else
>> -			nr = 4;
>> +			if (nr > 4)
>> +				nr = 4;
>> +		} else {
>> +			nr = 2;
>> +		}
>>   	}
> Why do you need to put a max on? Why isn't the proportional thing
> working as is? (is the average no good because of big variance or what)
First, the choice of 512 seems arbitrary. Second, the logic here is
that the enqueuing CPU should search for as long as it would take to
get work itself. Why is that the optimal amount to search?
>
> Again, why do you need to lower the min; what's wrong with 4?
>
> The reason I picked 4 is that many laptops have 4 CPUs and desktops
> really want to avoid queueing if at all possible.
To find the optimum upper and lower limits I varied them over many
combinations. An upper limit of 4 and a lower limit of 2 gave the best
results across most benchmarks.
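
For anyone following the thread, the effect of the quoted hunk on the scan
limit can be sketched as a standalone user-space C fragment. This is only an
illustration, not kernel code: nr_before(), nr_after() and div_u64_sketch()
are hypothetical names, the inputs are passed in as plain integers rather
than read from the sched domain, and the types are simplified to uint64_t.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's div_u64(); assumes divisor != 0. */
static uint64_t div_u64_sketch(uint64_t dividend, uint64_t divisor)
{
	return dividend / divisor;
}

/* Scan limit before the patch: proportional to span_avg, never below 4. */
static uint64_t nr_before(uint64_t span_weight, uint64_t avg_idle,
			  uint64_t avg_cost)
{
	uint64_t span_avg = span_weight * avg_idle;

	if (span_avg > 4 * avg_cost)
		return div_u64_sketch(span_avg, avg_cost);
	return 4;
}

/* Scan limit after the patch: same proportion, clamped to the range 2..4. */
static uint64_t nr_after(uint64_t span_weight, uint64_t avg_idle,
			 uint64_t avg_cost)
{
	uint64_t span_avg = span_weight * avg_idle;

	if (span_avg > 2 * avg_cost) {
		uint64_t nr = div_u64_sketch(span_avg, avg_cost);

		return nr > 4 ? 4 : nr;	/* upper limit added by the patch */
	}
	return 2;			/* lower limit reduced from 4 to 2 */
}
```

With made-up inputs span_weight = 44, avg_idle = 1000 and avg_cost = 100, the
old logic allows a scan of 440 CPUs while the patched logic stops at 4; with
span_weight = 8, avg_idle = 10 and avg_cost = 100, the old logic scans 4 CPUs
and the patched logic scans 2.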