Message-ID: <d8729277-f75f-449c-af12-9cd2f200a510@bytedance.com>
Date: Fri, 27 Dec 2024 15:59:22 +0800
From: Chuyi Zhou <zhouchuyi@...edance.com>
To: K Prateek Nayak <kprateek.nayak@....com>, mingo@...hat.com,
peterz@...radead.org, juri.lelli@...hat.com, vincent.guittot@...aro.org,
dietmar.eggemann@....com, rostedt@...dmis.org, bsegall@...gle.com,
mgorman@...e.de, vschneid@...hat.com
Cc: chengming.zhou@...ux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/3] sched/fair: Ensure select housekeeping cpus in
task_numa_find_cpu
Hello,
On 2024/12/27 12:40, K Prateek Nayak wrote:
> Hello Chuyi,
>
> On 12/23/2024 6:28 PM, Chuyi Zhou wrote:
>>
>>
>> On 2024/12/18 14:21, K Prateek Nayak wrote:
>>> Hello Chuyi,
>>>
>>> On 12/16/2024 5:53 PM, Chuyi Zhou wrote:
>>>> [..snip..]
>>>> @@ -2081,6 +2081,12 @@ numa_type numa_classify(unsigned int
>>>> imbalance_pct,
>>>> return node_fully_busy;
>>>> }
>>>> +static inline bool numa_migrate_test_cpu(struct task_struct *p, int
>>>> cpu)
>>>> +{
>>>> + return cpumask_test_cpu(cpu, p->cpus_ptr) &&
>>>> + housekeeping_cpu(cpu, HK_TYPE_DOMAIN);
>>>> +}
>>>> +
>>>> #ifdef CONFIG_SCHED_SMT
>>>> /* Forward declarations of select_idle_sibling helpers */
>>>> static inline bool test_idle_cores(int cpu);
>>>> @@ -2168,7 +2174,7 @@ static void task_numa_assign(struct
>>>> task_numa_env *env,
>>>> /* Find alternative idle CPU. */
>>>> for_each_cpu_wrap(cpu, cpumask_of_node(env->dst_nid),
>>>> start + 1) {
>>>
>>> Can we just do:
>>>
>>> for_each_cpu_and(cpu, cpumask_of_node(env->dst_nid),
>>> housekeeping_cpumask(HK_TYPE_DOMAIN)) {
>>> ...
>>> }
>>>
>>> and avoid adding numa_migrate_test_cpu(). Thoughts?
>>
>> Makes sense, but there doesn't currently seem to be an API like
>> for_each_cpu_wrap_and().
>>
>> Do you think the following is better?
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 855df103f4dd..4792ef672738 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -2167,9 +2167,9 @@ static void task_numa_assign(struct
>> task_numa_env *env,
>> int start = env->dst_cpu;
>>
>> /* Find alternative idle CPU. */
>> - for_each_cpu_wrap(cpu, cpumask_of_node(env->dst_nid),
>> start + 1) {
>> + for_each_cpu_and(cpu, cpumask_of_node(env->dst_nid),
>> housekeeping_cpumask(HK_TYPE_DOMAIN)) {
>> if (cpu == env->best_cpu || !idle_cpu(cpu) ||
>
> "start" is set to "env->dst_cpu", which is already taken care of here
> by the first comparison.
>
>> - !cpumask_test_cpu(cpu, env->p->cpus_ptr)) {
>> + cpu == start || !cpumask_test_cpu(cpu,
>> env->p->cpus_ptr)) {
>> continue;
>> }
>>
>
> I think the for_each_cpu_wrap() was used to reduce contention for xchg
> operation below. Perhaps we can have a per-cpu temporary mask (like
> load_balance_mask) if we want to reduce the xchg contention and break
> this into cpumask_and() + for_each_cpu_wrap() steps. I'm not sure if
> any of the existing masks (load_balance_mask, select_rq_mask,
> should_we_balance_tmpmask) can be safely reused. Otherwise, perhaps we
> can make a case for for_each_cpu_and_wrap() with this use case.
>
for_each_cpu_and_wrap() is a good idea, but it might be slightly
off-topic for this subject. Perhaps we should stick with this
implementation for now and see what others think about v2.
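
To make the semantics under discussion concrete, here is a minimal
userspace sketch of what an "and + wrap" walk would visit. It is not
kernel code: mask_t, NR_CPUS_SKETCH and and_wrap_order() are hypothetical
names for illustration only, and a plain 64-bit word stands in for a
cpumask. It only shows the visiting order one would expect from
intersecting two masks and then iterating from a start CPU with
wrap-around.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch size; a real cpumask is sized by the kernel config. */
#define NR_CPUS_SKETCH 64

/* Hypothetical stand-in for a cpumask: bit n set means CPU n is in the mask. */
typedef uint64_t mask_t;

/*
 * Record, in visiting order, every CPU that is set in both 'a' and 'b'.
 * The walk begins at 'start' and wraps around past the last CPU, so each
 * CPU is considered exactly once. Returns the number of CPUs written to
 * out[], which must have room for NR_CPUS_SKETCH entries.
 */
static int and_wrap_order(mask_t a, mask_t b, int start, int *out)
{
	mask_t both = a & b;	/* the "and" step */
	int n = 0;

	assert(start >= 0 && start < NR_CPUS_SKETCH);

	for (int i = 0; i < NR_CPUS_SKETCH; i++) {
		int cpu = (start + i) % NR_CPUS_SKETCH;	/* the "wrap" step */

		if (both & ((mask_t)1 << cpu))
			out[n++] = cpu;
	}
	return n;
}
```

For example, with a node mask covering CPUs 0-7 (0xFF), a housekeeping
mask covering CPUs 0-3 and 6 (0x4F), and a start of 5, the walk visits
6, 0, 1, 2, 3. Different start values give different first candidates,
which is the property the wrap variant is meant to preserve.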
Thanks.