Message-ID: <b8f5837a-2112-4bca-b99c-98ca41d3ec66@amd.com>
Date: Fri, 27 Dec 2024 10:10:49 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Chuyi Zhou <zhouchuyi@...edance.com>, <mingo@...hat.com>,
<peterz@...radead.org>, <juri.lelli@...hat.com>,
<vincent.guittot@...aro.org>, <dietmar.eggemann@....com>,
<rostedt@...dmis.org>, <bsegall@...gle.com>, <mgorman@...e.de>,
<vschneid@...hat.com>
CC: <chengming.zhou@...ux.dev>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 3/3] sched/fair: Ensure select housekeeping cpus in
task_numa_find_cpu
Hello Chuyi,
On 12/23/2024 6:28 PM, Chuyi Zhou wrote:
>
>
> On 2024/12/18 14:21, K Prateek Nayak wrote:
>> Hello Chuyi,
>>
>> On 12/16/2024 5:53 PM, Chuyi Zhou wrote:
>>> [..snip..]
>>> @@ -2081,6 +2081,12 @@ numa_type numa_classify(unsigned int imbalance_pct,
>>> return node_fully_busy;
>>> }
>>> +static inline bool numa_migrate_test_cpu(struct task_struct *p, int cpu)
>>> +{
>>> + return cpumask_test_cpu(cpu, p->cpus_ptr) &&
>>> + housekeeping_cpu(cpu, HK_TYPE_DOMAIN);
>>> +}
>>> +
>>> #ifdef CONFIG_SCHED_SMT
>>> /* Forward declarations of select_idle_sibling helpers */
>>> static inline bool test_idle_cores(int cpu);
>>> @@ -2168,7 +2174,7 @@ static void task_numa_assign(struct task_numa_env *env,
>>> /* Find alternative idle CPU. */
>>> for_each_cpu_wrap(cpu, cpumask_of_node(env->dst_nid), start + 1) {
>>
>> Can we just do:
>>
>> for_each_cpu_and(cpu, cpumask_of_node(env->dst_nid), housekeeping_cpumask(HK_TYPE_DOMAIN)) {
>> ...
>> }
>>
>> and avoid adding numa_migrate_test_cpu(). Thoughts?
>
> Makes sense, but there doesn't currently seem to be an API like for_each_cpu_wrap_and().
>
> Do you think the following is better?
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 855df103f4dd..4792ef672738 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2167,9 +2167,9 @@ static void task_numa_assign(struct task_numa_env *env,
> int start = env->dst_cpu;
>
> /* Find alternative idle CPU. */
> - for_each_cpu_wrap(cpu, cpumask_of_node(env->dst_nid), start + 1) {
> + for_each_cpu_and(cpu, cpumask_of_node(env->dst_nid), housekeeping_cpumask(HK_TYPE_DOMAIN)) {
> if (cpu == env->best_cpu || !idle_cpu(cpu) ||
"start" is set to "env->dst_cpu", which is already taken care of here
by the first comparison.
> - !cpumask_test_cpu(cpu, env->p->cpus_ptr)) {
> + cpu == start || !cpumask_test_cpu(cpu, env->p->cpus_ptr)) {
> continue;
> }
>
I think for_each_cpu_wrap() was used to reduce contention on the xchg
operation below. If we want to keep that property, perhaps we can use a
per-cpu temporary mask (like load_balance_mask) and break this into
cpumask_and() + for_each_cpu_wrap() steps. I'm not sure whether any of
the existing masks (load_balance_mask, select_rq_mask,
should_we_balance_tmpmask) can be safely reused. Otherwise, perhaps we
can make a case for a for_each_cpu_and_wrap() helper with this use case.
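
To illustrate the cpumask_and() + for_each_cpu_wrap() split, here is a
rough userspace sketch, not kernel code. It assumes at most 64 CPUs so
that a single uint64_t can stand in for struct cpumask, and the helpers
mask_and(), collect_wrap() and find_alternative_cpu() are hypothetical
names made up for this example. The xchg on numa_migrate_on is left out;
the sketch only shows the iteration order.

```c
#include <stdint.h>

#define NR_CPUS 64	/* toy limit: one 64-bit word models a cpumask */

/* Stand-in for cpumask_and(): intersection of two masks. */
static uint64_t mask_and(uint64_t a, uint64_t b)
{
	return a & b;
}

/*
 * Stand-in for for_each_cpu_wrap(): store every CPU set in @mask into
 * @out, beginning the scan at @start and wrapping around past the last
 * CPU. Returns the number of CPUs stored.
 */
static int collect_wrap(uint64_t mask, int start, int *out)
{
	int n = 0;

	for (int i = 0; i < NR_CPUS; i++) {
		int cpu = (start + i) % NR_CPUS;

		if (mask & (UINT64_C(1) << cpu))
			out[n++] = cpu;
	}
	return n;
}

/*
 * Hypothetical model of the search in task_numa_assign(): intersect the
 * node mask with the housekeeping mask once, then walk the result
 * starting just after @dst_cpu. Returns the first idle, allowed CPU
 * other than @best_cpu, or -1 if there is none.
 */
static int find_alternative_cpu(uint64_t node_mask, uint64_t hk_mask,
				uint64_t idle_mask, uint64_t allowed_mask,
				int best_cpu, int dst_cpu)
{
	int cpus[NR_CPUS];
	uint64_t tmp = mask_and(node_mask, hk_mask);	/* the temporary mask */
	int n = collect_wrap(tmp, (dst_cpu + 1) % NR_CPUS, cpus);

	for (int i = 0; i < n; i++) {
		int cpu = cpus[i];

		if (cpu == best_cpu ||
		    !(idle_mask & (UINT64_C(1) << cpu)) ||
		    !(allowed_mask & (UINT64_C(1) << cpu)))
			continue;
		return cpu;
	}
	return -1;
}
```

With a node covering CPUs 8-15 and CPUs 8-9 isolated, a search with
dst_cpu = 13 begins at CPU 14 in this model, whereas a plain
for_each_cpu_and() walk would always begin at CPU 10. Spreading the
starting points this way is the property the wrap variant preserves.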
>
> Thanks.
>
>
>>
>>> if (cpu == env->best_cpu || !idle_cpu(cpu) ||
>>> - !cpumask_test_cpu(cpu, env->p->cpus_ptr)) {
>>> + !numa_migrate_test_cpu(env->p, cpu)) {
>>> continue;
>>> }
>>> @@ -2480,7 +2486,7 @@ static void task_numa_find_cpu(struct task_numa_env *env,
>>> for_each_cpu(cpu, cpumask_of_node(env->dst_nid)) {
>>
>> Same modifications can be made for this outer loop.
>>
>
--
Thanks and Regards,
Prateek