Message-Id: <20250819025720.14794-1-adamli@os.amperecomputing.com>
Date: Tue, 19 Aug 2025 02:57:20 +0000
From: Adam Li <adamli@...amperecomputing.com>
To: mingo@...hat.com,
	peterz@...radead.org,
	juri.lelli@...hat.com,
	vincent.guittot@...aro.org
Cc: dietmar.eggemann@....com,
	rostedt@...dmis.org,
	bsegall@...gle.com,
	mgorman@...e.de,
	vschneid@...hat.com,
	cl@...ux.com,
	frederic@...nel.org,
	linux-kernel@...r.kernel.org,
	patches@...erecomputing.com,
	Adam Li <adamli@...amperecomputing.com>
Subject: [PATCH] sched/nohz: Fix NOHZ imbalance by adding options for ILB CPU

To qualify for running NOHZ idle load balancing (ILB), a CPU has to be:
1) a housekeeping CPU in housekeeping_cpumask(HK_TYPE_KERNEL_NOISE)
2) and in nohz.idle_cpus_mask
3) and idle
4) and not the current CPU

If most CPUs are in the nohz_full CPU list, few housekeeping CPUs are left.
In the worst case, if all CPUs are in nohz_full, only the boot CPU is used
for housekeeping. The housekeeping CPU is usually busier, so it is
unlikely to be added to nohz.idle_cpus_mask.

Therefore, if there are few housekeeping CPUs, find_new_ilb() is likely to
fail to find any CPU to do NOHZ idle load balancing. Some NOHZ CPUs may
then stay idle while other CPUs are busy.

This patch adds fallback options when looking for an ILB CPU if no CPU
meets the above requirements. It then searches in the order below:
1) Try looking for the first idle housekeeping CPU.
2) Try looking for the first idle CPU in nohz.idle_cpus_mask if SMT is
   not active.
3) Select the first housekeeping CPU even if it is busy.
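
The search order described above can be sketched as a small userspace
model. This is only an illustration under stated assumptions, not the
kernel code: CPUs are bits in a 64-bit word, the `idle` argument stands
in for idle_cpu(), and the names model_find_new_ilb(), first_idle_cpu(),
first_cpu() and MODEL_NR_CPUS are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 64	/* hypothetical limit: one bit per CPU */

/* Lowest-numbered CPU in @mask that is idle and is not @this_cpu, or -1. */
static int first_idle_cpu(uint64_t mask, uint64_t idle, int this_cpu)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		uint64_t bit = UINT64_C(1) << cpu;

		if (cpu == this_cpu)
			continue;
		if ((mask & bit) && (idle & bit))
			return cpu;
	}
	return -1;
}

/* Lowest-numbered CPU in @mask, or -1 if the mask is empty. */
static int first_cpu(uint64_t mask)
{
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (UINT64_C(1) << cpu))
			return cpu;
	return -1;
}

/*
 * Model of the described search order. @hk models the housekeeping mask,
 * @nohz_idle models nohz.idle_cpus_mask, @idle models idle_cpu().
 */
static int model_find_new_ilb(uint64_t hk, uint64_t nohz_idle, uint64_t idle,
			      int this_cpu, bool smt_active)
{
	int cpu;

	/* Preferred: idle, housekeeping, and in the NOHZ idle mask. */
	cpu = first_idle_cpu(hk & nohz_idle, idle, this_cpu);
	if (cpu >= 0)
		return cpu;

	/* Fallback 1: any idle housekeeping CPU. */
	cpu = first_idle_cpu(hk, idle, this_cpu);
	if (cpu >= 0)
		return cpu;

	/* Fallback 2: any idle CPU in the NOHZ idle mask, only without SMT. */
	if (!smt_active) {
		cpu = first_idle_cpu(nohz_idle, idle, this_cpu);
		if (cpu >= 0)
			return cpu;
	}

	/* Fallback 3: first housekeeping CPU, even if it is busy. */
	return first_cpu(hk);
}
```

For example, with a single busy housekeeping CPU 0 and idle nohz_full
CPUs 1-3, a call from CPU 1 without SMT picks CPU 2 in this model; with
SMT active it falls through to the busy housekeeping CPU 0.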

With this patch, NOHZ idle balancing happens more frequently.

Signed-off-by: Adam Li <adamli@...amperecomputing.com>
---
 kernel/sched/fair.c | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index b173a059315c..12bcc3f81f9b 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -12194,19 +12194,45 @@ static inline int on_null_domain(struct rq *rq)
 static inline int find_new_ilb(void)
 {
 	const struct cpumask *hk_mask;
-	int ilb_cpu;
+	struct cpumask ilb_mask;
+	int ilb_cpu, this_cpu = smp_processor_id();
 
 	hk_mask = housekeeping_cpumask(HK_TYPE_KERNEL_NOISE);
 
-	for_each_cpu_and(ilb_cpu, nohz.idle_cpus_mask, hk_mask) {
+	/*
+	 * Look for an idle CPU that is both NOHZ idle and housekeeping.
+	 * If there is no such CPU, look for an idle housekeeping CPU.
+	 */
+	if (!cpumask_and(&ilb_mask, nohz.idle_cpus_mask, hk_mask))
+		cpumask_copy(&ilb_mask, hk_mask);
 
-		if (ilb_cpu == smp_processor_id())
+	for_each_cpu(ilb_cpu, &ilb_mask) {
+		if (ilb_cpu == this_cpu)
 			continue;
 
 		if (idle_cpu(ilb_cpu))
 			return ilb_cpu;
 	}
 
+	/*
+	 * If SMT is not active, look for an idle CPU in the NOHZ idle mask.
+	 * Running NOHZ ILB may cause jitter on the SMT sibling CPU.
+	 */
+	if (!sched_smt_active()) {
+		for_each_cpu(ilb_cpu, nohz.idle_cpus_mask) {
+			if (ilb_cpu == this_cpu)
+				continue;
+
+			if (idle_cpu(ilb_cpu))
+				return ilb_cpu;
+		}
+	}
+
+	/* Select the first housekeeping cpu anyway. */
+	ilb_cpu = cpumask_first(hk_mask);
+	if (ilb_cpu < nr_cpu_ids)
+		return ilb_cpu;
+
 	return -1;
 }
 
-- 
2.34.1

