linux-kernel - Re: [PATCH 3/3] sched/fair: Ensure select housekeeping cpus in task_numa_find

Message-ID: <d8729277-f75f-449c-af12-9cd2f200a510@bytedance.com>
Date: Fri, 27 Dec 2024 15:59:22 +0800
From: Chuyi Zhou <zhouchuyi@...edance.com>
To: K Prateek Nayak <kprateek.nayak@....com>, mingo@...hat.com,
 peterz@...radead.org, juri.lelli@...hat.com, vincent.guittot@...aro.org,
 dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
 mgorman@...e.de, vschneid@...hat.com
Cc: chengming.zhou@...ux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] sched/fair: Ensure select housekeeping cpus in
 task_numa_find_cpu

Hello,

On 2024/12/27 12:40, K Prateek Nayak wrote:
> Hello Chuyi,
> 
> On 12/23/2024 6:28 PM, Chuyi Zhou wrote:
>>
>>
>> On 2024/12/18 14:21, K Prateek Nayak wrote:
>>> Hello Chuyi,
>>>
>>> On 12/16/2024 5:53 PM, Chuyi Zhou wrote:
>>>> [..snip..]
>>>> @@ -2081,6 +2081,12 @@ numa_type numa_classify(unsigned int imbalance_pct,
>>>>       return node_fully_busy;
>>>>   }
>>>> +static inline bool numa_migrate_test_cpu(struct task_struct *p, int cpu)
>>>> +{
>>>> +    return cpumask_test_cpu(cpu, p->cpus_ptr) &&
>>>> +            housekeeping_cpu(cpu, HK_TYPE_DOMAIN);
>>>> +}
>>>> +
>>>>   #ifdef CONFIG_SCHED_SMT
>>>>   /* Forward declarations of select_idle_sibling helpers */
>>>>   static inline bool test_idle_cores(int cpu);
>>>> @@ -2168,7 +2174,7 @@ static void task_numa_assign(struct task_numa_env *env,
>>>>           /* Find alternative idle CPU. */
>>>>           for_each_cpu_wrap(cpu, cpumask_of_node(env->dst_nid), start + 1) {
>>>
>>> Can we just do:
>>>
>>>      for_each_cpu_and(cpu, cpumask_of_node(env->dst_nid), housekeeping_cpumask(HK_TYPE_DOMAIN)) {
>>>          ...
>>>      }
>>>
>>> and avoid adding numa_migrate_test_cpu(). Thoughts?
>>
>> Makes sense, but there doesn't seem to be an API like 
>> for_each_cpu_wrap_and() at the moment.
>>
>> Do you think the following is better?
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 855df103f4dd..4792ef672738 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -2167,9 +2167,9 @@ static void task_numa_assign(struct task_numa_env *env,
>>                  int start = env->dst_cpu;
>>
>>                  /* Find alternative idle CPU. */
>> -               for_each_cpu_wrap(cpu, cpumask_of_node(env->dst_nid), start + 1) {
>> +               for_each_cpu_and(cpu, cpumask_of_node(env->dst_nid), housekeeping_cpumask(HK_TYPE_DOMAIN)) {
>>                          if (cpu == env->best_cpu || !idle_cpu(cpu) ||
> 
> "start" is set to "env->dst_cpu", which is already taken care of here
> by the first comparison.
> 
>> -                           !cpumask_test_cpu(cpu, env->p->cpus_ptr)) {
>> +                               cpu == start || !cpumask_test_cpu(cpu, env->p->cpus_ptr)) {
>>                                  continue;
>>                          }
>>
> 
> I think the for_each_cpu_wrap() was used to reduce contention for xchg
> operation below. Perhaps we can have a per-cpu temporary mask (like
> load_balance_mask) if we want to reduce the xchg contention and break
> this into cpumask_and() + for_each_cpu_wrap() steps. I'm not sure if
> any of the existing masks (load_balance_mask, select_rq_mask,
> should_we_balance_tmpmask) can be safely reused. Otherwise, perhaps we
> can make a case for for_each_cpu_and_wrap() with this use case.
> 


for_each_cpu_and_wrap() is a good idea, but it might be slightly 
off-topic for this subject. Perhaps we should stick with this 
implementation for now and see what others think about v2.
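
For reference, here is a rough userspace sketch of the semantics discussed
above: intersect the node mask with the housekeeping mask, then walk the
result starting just after dst_cpu and wrapping around. It is only a model,
not kernel code. A "cpumask" is a plain 64-bit word, and
pick_alternative_cpu() is a hypothetical name that does not exist in the
kernel; the xchg() on numa_migrate_on is left out entirely.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: bit N of the word stands for CPU N. */
typedef uint64_t mask_t;

static int mask_test(mask_t m, int cpu)
{
	return (int)((m >> cpu) & 1);
}

/*
 * Hypothetical helper, not a kernel API. Models the "find alternative
 * idle CPU" loop with the housekeeping filter folded into the iteration:
 *
 *   - candidates = node & housekeeping      (the cpumask_and() step)
 *   - visit candidates from dst_cpu + 1, wrapping, dst_cpu itself last
 *   - skip best_cpu, non-idle CPUs and CPUs the task is not allowed on
 *
 * Returns the chosen CPU, or -1 if no candidate qualifies.
 */
static int pick_alternative_cpu(mask_t node, mask_t housekeeping,
				mask_t idle, mask_t allowed,
				int dst_cpu, int best_cpu, int nr_cpus)
{
	mask_t candidates = node & housekeeping;
	int i;

	assert(nr_cpus > 0 && nr_cpus <= 64);
	assert(dst_cpu >= 0 && dst_cpu < nr_cpus);

	for (i = 1; i <= nr_cpus; i++) {
		int cpu = (dst_cpu + i) % nr_cpus;

		if (!mask_test(candidates, cpu))
			continue;
		if (cpu == best_cpu || !mask_test(idle, cpu) ||
		    !mask_test(allowed, cpu))
			continue;
		return cpu;
	}
	return -1;
}
```

With node CPUs 4-7 (0xF0) and housekeeping CPUs 0-5 (0x3F), only CPUs 4
and 5 are ever considered, and the starting point still depends on
dst_cpu, which is the property the plain for_each_cpu_and() version loses.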


Thanks.