Message-ID: <13e885b6-ce95-b39b-9530-d5fdced8d4c3@amd.com>
Date: Thu, 24 Aug 2017 08:14:55 +0700
From: Suravee Suthikulpanit <Suravee.Suthikulpanit@....com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: linux-kernel@...r.kernel.org, mingo@...hat.com, bp@...e.de
Subject: Re: [RFC PATCH] sched/topology: Introduce NUMA identity node sched domain
Hi Peter,
On 8/14/17 14:44, Suravee Suthikulpanit wrote:
>
>
> On 8/11/17 16:15, Peter Zijlstra wrote:
>> On Fri, Aug 11, 2017 at 12:58:22PM +0700, Suravee Suthikulpanit wrote:
>>>
>>>
>>> On 8/11/17 11:57, Suravee Suthikulpanit wrote:
>>>>
>>>>>> [...]
>>>>>> @@ -1445,9 +1448,24 @@ void sched_init_numa(void)
>>>>>>          tl[i] = sched_domain_topology[i];
>>>>>>
>>>>>>      /*
>>>>>> +     * Ignore the NUMA identity level if it has the same cpumask
>>>>>> +     * as previous level. This is the case for:
>>>>>> +     *   - System with last-level-cache (MC) sched domain span a NUMA node.
>>>>>> +     *   - System with DIE sched domain span a NUMA node.
>>>>>> +     *
>>>>>> +     * Assume all NUMA nodes are identical, so only check node 0.
>>>>>> +     */
>>>>>> +    if (!cpumask_equal(sched_domains_numa_masks[0][0], tl[i-1].mask(0)))
>>>>>> +        tl[i++] = (struct sched_domain_topology_level){
>>>>>> +            .mask = sd_numa_mask,
>>>>>> +            .numa_level = 0,
>>>>>> +            SD_INIT_NAME(NODE)
>>>>>> +        };
>>>>>
>>>>> So what you've forgotten to mention is that for those systems where the
>>>>> LLC == NODE this now superfluous level gets removed by the degenerate
>>>>> code. Have you verified that does the right thing?
>>>>
>>>> Let me check with that one and get back.
>>>
>>> Actually, it is not removed by the degenerate code. That is what this logic
>>> is for. It checks for LLC == NODE or DIE == NODE before setting up the NODE
>>> sched level. I can update the comment. This has also been tested on system
>>> w/ LLC == NODE.
>>
>> Why does the degenerate code fail to remove things?
>>
>
> Sorry for the confusion. Actually, the degenerate code does remove the duplicate
> NODE sched-domain.
>
> The logic above is taking a different approach. Instead of depending on the
> degenerate code during cpu_attach_domain() at a later time, it would exclude the
> NODE sched-domain during sched_init_numa(). The difference is, without
> !cpumask_equal(), now the MC sched-domain would have the SD_PREFER_SIBLING flag
> set by the degenerate code since the flag got transferred down from the NODE to
> MC sched-domain. Would this be the preferred behavior for MC sched-domain?
>
> Regards,
> Suravee
Any feedback on this part?
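
To make the question concrete, here is a rough user-space sketch of the early-exclusion approach. It is not kernel code: toy_cpumask_t, struct toy_level and toy_add_node_level() are hypothetical names made up for illustration, and a plain 64-bit word stands in for a cpumask. It only models the !cpumask_equal() check in sched_init_numa(); it does not model the degenerate path or the SD_PREFER_SIBLING transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a cpumask: one bit per CPU (up to 64 CPUs). */
typedef uint64_t toy_cpumask_t;

/* Hypothetical, simplified topology level: a name and the span seen from CPU 0. */
struct toy_level {
	const char *name;
	toy_cpumask_t span;
};

/*
 * Append a NODE level on top of the n existing levels, unless its span is
 * identical to the span of the current topmost level (e.g. MC or DIE already
 * covers the whole NUMA node). Returns the new number of levels.
 *
 * The caller must provide room for at least n + 1 entries in tl[].
 */
static size_t toy_add_node_level(struct toy_level *tl, size_t n,
				 toy_cpumask_t node_span)
{
	assert(n > 0);	/* there must be a previous level to compare against */

	if (tl[n - 1].span != node_span) {
		tl[n].name = "NODE";
		tl[n].span = node_span;
		n++;
	}
	return n;
}
```

With an MC level spanning CPUs 0-7 and a node that also spans CPUs 0-7, the level count stays the same and no NODE level is created. With an MC level spanning only CPUs 0-3 inside an 8-CPU node, NODE is appended as a new top level.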
Thanks,
Suravee