Message-ID: <20251201183146.74443-1-sshegde@linux.ibm.com>
Date: Tue, 2 Dec 2025 00:01:42 +0530
From: Shrikanth Hegde <sshegde@...ux.ibm.com>
To: mingo@...nel.org, peterz@...radead.org, vincent.guittot@...aro.org,
linux-kernel@...r.kernel.org, kprateek.nayak@....com
Cc: sshegde@...ux.ibm.com, dietmar.eggemann@....com, vschneid@...hat.com,
rostedt@...dmis.org, tglx@...utronix.de, tim.c.chen@...ux.intel.com
Subject: [PATCH 0/4] sched/fair: improve nohz fields for large systems
On large systems, the cacheline holding nohz.nr_cpus was observed to
bounce quite often. Many CPUs do atomic inc/dec and reads on it at the
same time, so the line keeps moving between caches.
The gist of the series is to get rid of nr_cpus and instead use the
cpumask, which is always updated alongside it. Functionally it should
serve the same purpose. At worst, one might miss an idle load balance
due to a race. Going by the comments in the code, that can happen even
today.
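To illustrate the idea outside the kernel, here is a minimal userspace
sketch using C11 atomics. It is not the code from the patches: the
sketch_* names, the fixed 256-CPU limit and the open-coded bitmask are
all assumptions made for the example, standing in for the kernel's
cpumask. It only shows that "is any CPU idle?" can be answered from the
mask alone, without a separate atomic counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sizes for the sketch; the kernel sizes its cpumask itself. */
#define SKETCH_MAX_CPUS      256
#define SKETCH_BITS_PER_WORD 64
#define SKETCH_WORDS         (SKETCH_MAX_CPUS / SKETCH_BITS_PER_WORD)

/* Stand-in for the idle-CPU mask: one bit per CPU, zero-initialized. */
static _Atomic uint64_t sketch_idle_mask[SKETCH_WORDS];

/* Mark a CPU idle: only the mask is written, no counter is touched. */
void sketch_enter_idle(unsigned int cpu)
{
	assert(cpu < SKETCH_MAX_CPUS);
	atomic_fetch_or_explicit(&sketch_idle_mask[cpu / SKETCH_BITS_PER_WORD],
				 UINT64_C(1) << (cpu % SKETCH_BITS_PER_WORD),
				 memory_order_relaxed);
}

/* Mark a CPU busy again by clearing its bit. */
void sketch_exit_idle(unsigned int cpu)
{
	assert(cpu < SKETCH_MAX_CPUS);
	atomic_fetch_and_explicit(&sketch_idle_mask[cpu / SKETCH_BITS_PER_WORD],
				  ~(UINT64_C(1) << (cpu % SKETCH_BITS_PER_WORD)),
				  memory_order_relaxed);
}

/*
 * Replacement for "counter != 0": scan the mask words. The words are
 * read one at a time, so a concurrent update can be missed, which
 * mirrors the benign race described above.
 */
bool sketch_any_idle(void)
{
	for (size_t i = 0; i < SKETCH_WORDS; i++) {
		if (atomic_load_explicit(&sketch_idle_mask[i],
					 memory_order_relaxed))
			return true;
	}
	return false;
}
```

The mask is still shared and still written on idle entry/exit, so this
does not remove all sharing. It only drops the extra atomic counter
that was updated in the same places.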
The other patches are minor. There are a couple of time checks that
bail out early. Check the variables after the time checks to avoid
cache references to them.
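A rough userspace sketch of that reordering, under stated assumptions:
the sketch_* names are hypothetical, the plain "now < next_check"
comparison ignores the wraparound handling that jiffies comparisons
need, and the read counter exists only to make the effect visible. The
point is that the shared variable is loaded only once the cheap time
check has passed.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical shared state, written by many CPUs (contended line). */
static _Atomic int sketch_shared_idle_count;

/* Instrumentation for the sketch only: counts loads of the shared line. */
static unsigned int sketch_shared_reads;

static int sketch_read_shared(void)
{
	sketch_shared_reads++;
	return atomic_load_explicit(&sketch_shared_idle_count,
				    memory_order_relaxed);
}

/*
 * Decide whether to kick an idle balance.
 * now and next_check are plain tick counts here; wraparound is not
 * handled in this sketch.
 */
bool sketch_should_kick(uint64_t now, uint64_t next_check)
{
	/* Cheap check first: bail out without touching the shared line. */
	if (now < next_check)
		return false;

	/* Only now pay for the possibly remote cacheline access. */
	return sketch_read_shared() != 0;
}
```

With the checks in the opposite order, every call would load the
shared line even when the time check alone is enough to bail out.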
There is a series which aims to solve contention by moving to LLC.
https://lore.kernel.org/all/20250904041516.3046-1-kprateek.nayak@amd.com/
Maybe these bits are useful for that too. We could discuss further at
LPC.
Ran "hackbench 100 process 5000 loops", collected perf cycles and
picked out the top nohz functions. Benchmark numbers don't change by
much. Will ask our performance team to collect numbers with the series.
baseline: tip sched/core at 3eb593560146
1.01% [k] nohz_balance_exit_idle
0.31% [k] nohz_balancer_kick
0.05% [k] nohz_balance_enter_idle
With series:
0.45% [k] nohz_balance_exit_idle
0.18% [k] nohz_balancer_kick
0.01% [k] nohz_balance_enter_idle
Shrikanth Hegde (4):
sched/fair: Move checking for nohz cpus after time check
sched/fair: Change likelyhood of nohz nr_cpus check
sched/fair: Check for blocked task after time check
sched/fair: Remove atomic nr_cpus and use cpumask instead
kernel/sched/fair.c | 20 ++++++++------------
1 file changed, 8 insertions(+), 12 deletions(-)
--
2.43.0