linux-kernel - Re: [RFC 2/3] sched/topology: fix sched groups on NUMA machines with mesh topology


Message-ID: <c21ceced-400c-0986-e53f-4c5eea8b23dd@redhat.com>
Date:   Mon, 17 Apr 2017 11:40:59 -0300
From:   Lauro Venancio <lvenanci@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     linux-kernel@...r.kernel.org, lwang@...hat.com, riel@...hat.com,
        Mike Galbraith <efault@....de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...nel.org>
Subject: Re: [RFC 2/3] sched/topology: fix sched groups on NUMA machines with
 mesh topology

On 04/14/2017 01:58 PM, Peter Zijlstra wrote:
> On Fri, Apr 14, 2017 at 01:38:13PM +0200, Peter Zijlstra wrote:
>> On Thu, Apr 13, 2017 at 10:56:08AM -0300, Lauro Ramos Venancio wrote:
>>> This patch constructs the sched groups from each CPU perspective. So, on
>>> a 4 nodes machine with ring topology, while nodes 0 and 2 keep the same
>>> groups as before [(3, 0, 1)(1, 2, 3)], nodes 1 and 3 have new groups
>>> [(0, 1, 2)(2, 3, 0)]. This allows moving tasks between any node 2-hops
>>> apart.
>> Ah,.. so after drawing pictures I see what went wrong; duh :-(
>>
>> An equivalent patch would be (if for_each_cpu_wrap() were exposed):
>>
>> @@ -521,11 +588,11 @@ build_overlap_sched_groups(struct sched_domain *sd, int cpu)
>>  	struct cpumask *covered = sched_domains_tmpmask;
>>  	struct sd_data *sdd = sd->private;
>>  	struct sched_domain *sibling;
>> -	int i;
>> +	int i, wrap;
>>  
>>  	cpumask_clear(covered);
>>  
>> -	for_each_cpu(i, span) {
>> +	for_each_cpu_wrap(i, span, cpu, wrap) {
>>  		struct cpumask *sg_span;
>>  
>>  		if (cpumask_test_cpu(i, covered))
>>
>>
>> We need to start iterating at @cpu, not start at 0 every time.
>>
>>
> OK, please have a look here:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git/log/?h=sched/core

Looks good, but please hold these patches until patch 3 is applied.
Without it, the sched_group_capacity (sg->sgc) instance is not selected
correctly, which causes a significant performance regression on all NUMA
machines.

I will continue this discussion in the other thread.
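
To illustrate the point quoted above (start iterating at @cpu instead of
at 0), here is a minimal userspace sketch of the group-building loop on
the 4-node ring. It is not the kernel code. It assumes one CPU per node,
uses a plain bitmask in place of struct cpumask, and the names NR_NODES,
sibling_span() and build_groups() are hypothetical helpers invented for
this example.

```c
#include <stddef.h>

#define NR_NODES 4

/* Nodes within one hop of n on a 4-node ring, as a bitmask (bit k = node k). */
unsigned int sibling_span(int n)
{
	int prev = (n + NR_NODES - 1) % NR_NODES;
	int next = (n + 1) % NR_NODES;

	return (1u << prev) | (1u << n) | (1u << next);
}

/*
 * Simplified model of the loop in the quoted diff: visit nodes starting
 * at @start and wrapping around, and open a new group for every node
 * that no earlier group already covers.
 * Returns the number of groups written to @groups (at most @max).
 */
size_t build_groups(int start, unsigned int *groups, size_t max)
{
	unsigned int covered = 0;
	size_t count = 0;
	int step;

	for (step = 0; step < NR_NODES && count < max; step++) {
		int i = (start + step) % NR_NODES;

		if (covered & (1u << i))
			continue;	/* node already belongs to a group */

		groups[count] = sibling_span(i);
		covered |= groups[count];
		count++;
	}
	return count;
}
```

In this model, build_groups(0, ...) and build_groups(2, ...) produce the
groups (3, 0, 1) and (1, 2, 3), while build_groups(1, ...) and
build_groups(3, ...) produce (0, 1, 2) and (2, 3, 0), matching the
per-node groups described in the patch changelog. Always starting at 0
would give every node the first pair.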