Message-ID: <4bd0e615-193d-47c6-8933-a93d75a2f29c@linux.ibm.com>
Date: Tue, 13 Jan 2026 15:00:49 +0530
From: Shrikanth Hegde <sshegde@...ux.ibm.com>
To: Vincent Guittot <vincent.guittot@...aro.org>
Cc: mingo@...nel.org, peterz@...radead.org, linux-kernel@...r.kernel.org,
kprateek.nayak@....com, juri.lelli@...hat.com, vschneid@...hat.com,
tglx@...nel.org, dietmar.eggemann@....com, anna-maria@...utronix.de,
frederic@...nel.org, wangyang.guo@...el.com
Subject: Re: [PATCH v4 3/3] sched/fair: Remove nohz.nr_cpus and use weight of
cpumask instead
Hi Vincent,
>> On system with 480 CPUs, running "hackbench 40 process 10000 loops"
>> (Avg of 3 runs)
>> baseline:
>> 0.81% hackbench [k] nohz_balance_exit_idle
>> 0.21% hackbench [k] nohz_balancer_kick
>> 0.09% swapper [k] nohz_run_idle_balance
>>
>> With patch:
>> 0.35% hackbench [k] nohz_balance_exit_idle
>> 0.09% hackbench [k] nohz_balancer_kick
>> 0.07% swapper [k] nohz_run_idle_balance
>>
>> [Ingo Molnar: scalability analysis changelog]
>> Signed-off-by: Shrikanth Hegde <sshegde@...ux.ibm.com>
>
> This change makes sense to me but I'm not convinced by patch1.
> You wrote in patch 1 that It doesn't provide any real benefit, what
> are the figures with only patch 3 ?
>
The whole point of patch 1 is that, in the normal case (i.e. the system won't be 100% busy):
- we read the value and then do the time check.
- we bail out if the time is not due.
Why bother reading if the time is not due?
I don't expect patch 1 to make any major difference, but it seemed like the right thing to do,
considering that most of the time the system won't be 100% busy.
So the numbers will be the same without patch 1.
>> ---
>> kernel/sched/fair.c | 5 +----
>> 1 file changed, 1 insertion(+), 4 deletions(-)
>>
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index c03f963f6216..3408a5beb95b 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -7144,7 +7144,6 @@ static DEFINE_PER_CPU(cpumask_var_t, should_we_balance_tmpmask);
>>
>> static struct {
>> cpumask_var_t idle_cpus_mask;
>> - atomic_t nr_cpus;
>> int has_blocked_load; /* Idle CPUS has blocked load */
>> int needs_update; /* Newly idle CPUs need their next_balance collated */
>> unsigned long next_balance; /* in jiffy units */
>> @@ -12466,7 +12465,7 @@ static void nohz_balancer_kick(struct rq *rq)
>> * None are in tickless mode and hence no need for NOHZ idle load
>> * balancing
>> */
>> - if (unlikely(!atomic_read(&nohz.nr_cpus)))
>> + if (unlikely(cpumask_empty(nohz.idle_cpus_mask)))
>> return;
>>
>> if (rq->nr_running >= 2) {
>> @@ -12579,7 +12578,6 @@ void nohz_balance_exit_idle(struct rq *rq)
>>
>> rq->nohz_tick_stopped = 0;
>> cpumask_clear_cpu(rq->cpu, nohz.idle_cpus_mask);
>> - atomic_dec(&nohz.nr_cpus);
>>
>> set_cpu_sd_state_busy(rq->cpu);
>> }
>> @@ -12637,7 +12635,6 @@ void nohz_balance_enter_idle(int cpu)
>> rq->nohz_tick_stopped = 1;
>>
>> cpumask_set_cpu(cpu, nohz.idle_cpus_mask);
>> - atomic_inc(&nohz.nr_cpus);
>>
>> /*
>> * Ensures that if nohz_idle_balance() fails to observe our
>> --
>> 2.47.3
>>