Message-ID: <20251201183146.74443-5-sshegde@linux.ibm.com>
Date: Tue, 2 Dec 2025 00:01:46 +0530
From: Shrikanth Hegde <sshegde@...ux.ibm.com>
To: mingo@...nel.org, peterz@...radead.org, vincent.guittot@...aro.org,
linux-kernel@...r.kernel.org, kprateek.nayak@....com
Cc: sshegde@...ux.ibm.com, dietmar.eggemann@....com, vschneid@...hat.com,
rostedt@...dmis.org, tglx@...utronix.de, tim.c.chen@...ux.intel.com
Subject: [PATCH 4/4] sched/fair: Remove atomic nr_cpus and use cpumask instead

Currently nohz.nr_cpus is kept in lockstep with nohz.idle_cpus_mask:

nohz_balance_enter_idle:
        cpumask_set_cpu(cpu, nohz.idle_cpus_mask)
        atomic_inc(&nohz.nr_cpus)

nohz_balance_exit_idle:
        cpumask_clear_cpu(rq->cpu, nohz.idle_cpus_mask)
        atomic_dec(&nohz.nr_cpus)

nohz_balancer_kick:
        if (unlikely(!atomic_read(&nohz.nr_cpus)))
                goto out;
So idle_cpus_mask already carries the same information. Instead of
maintaining a costly atomic counter on large systems, it is better to
check whether the cpumask is empty to make the same decision to trigger
idle load balance.
There might be a race between the cpumask_empty() check and a remote CPU
setting its bit in the cpumask. In that case the idle load balance will
simply be triggered at the next tick. A race with clearing of the bit is
not a concern, since _nohz_idle_balance() checks whether a CPU is idle
before balancing it. The set-side race is illustrated below.
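
A hedged sketch of that benign interleaving (CPU names are hypothetical,
and this is just one possible timing):

        CPU A (nohz_balancer_kick)       CPU B (nohz_balance_enter_idle)
        cpumask_empty() -> true
                                         cpumask_set_cpu(B, idle_cpus_mask)
        goto out (no kick this tick)
        ...next tick...
        cpumask_empty() -> false, kick proceeds

The kick is only deferred by one tick; the old nr_cpus check had the same
window between atomic_read() on one CPU and atomic_inc() on another.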
cpumask_empty() boils down to an ffs-style bit scan (find_first_bit()),
so it should not be very costly.
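
For reference, a paraphrased sketch of the helpers involved (simplified
from include/linux/cpumask.h and include/linux/bitmap.h; details such as
the small_cpumask_bits handling vary by kernel version and config):

        static inline bool cpumask_empty(const struct cpumask *srcp)
        {
                return bitmap_empty(cpumask_bits(srcp), small_cpumask_bits);
        }

        static inline bool bitmap_empty(const unsigned long *src, unsigned int nbits)
        {
                /* Masks that fit in one word need a single load and mask. */
                if (small_const_nbits(nbits))
                        return !(*src & BITMAP_LAST_WORD_MASK(nbits));

                /* Otherwise scan; the mask is empty iff no set bit is found. */
                return find_first_bit(src, nbits) == nbits;
        }
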
Signed-off-by: Shrikanth Hegde <sshegde@...ux.ibm.com>
---
kernel/sched/fair.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 5534822fd754..3afa79d733dd 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7144,7 +7144,6 @@ static DEFINE_PER_CPU(cpumask_var_t, should_we_balance_tmpmask);
static struct {
cpumask_var_t idle_cpus_mask;
- atomic_t nr_cpus;
int has_blocked; /* Idle CPUS has blocked load */
int needs_update; /* Newly idle CPUs need their next_balance collated */
unsigned long next_balance; /* in jiffy units */
@@ -12450,7 +12449,7 @@ static void nohz_balancer_kick(struct rq *rq)
* None are in tickless mode and hence no need for NOHZ idle load
* balancing, do stats update if its due
*/
- if (unlikely(!atomic_read(&nohz.nr_cpus)))
+ if (unlikely(cpumask_empty(nohz.idle_cpus_mask)))
goto out;
if (rq->nr_running >= 2) {
@@ -12563,7 +12562,6 @@ void nohz_balance_exit_idle(struct rq *rq)
rq->nohz_tick_stopped = 0;
cpumask_clear_cpu(rq->cpu, nohz.idle_cpus_mask);
- atomic_dec(&nohz.nr_cpus);
set_cpu_sd_state_busy(rq->cpu);
}
@@ -12621,7 +12619,6 @@ void nohz_balance_enter_idle(int cpu)
rq->nohz_tick_stopped = 1;
cpumask_set_cpu(cpu, nohz.idle_cpus_mask);
- atomic_inc(&nohz.nr_cpus);
/*
* Ensures that if nohz_idle_balance() fails to observe our
--
2.43.0