Message-Id: <20250422194853.1636334-1-opendmb@gmail.com>
Date: Tue, 22 Apr 2025 12:48:53 -0700
From: Doug Berger <opendmb@...il.com>
To: Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>,
	Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Florian Fainelli <florian.fainelli@...adcom.com>,
	linux-kernel@...r.kernel.org,
	Doug Berger <opendmb@...il.com>
Subject: [PATCH] sched/topology: clear freecpu bit on detach

There is a hazard in the deadline scheduler where an offlined CPU
can have its free_cpus bit left set in the def_root_domain when
the schedutil cpufreq governor is used. This can allow a deadline
thread to be pushed to the runqueue of a powered-down CPU, which
breaks scheduling. The details can be found here:
https://lore.kernel.org/lkml/20250110233010.2339521-1-opendmb@gmail.com

The free_cpus mask is expected to be cleared by set_rq_offline();
however, the hazard occurs before the root domain is made online
during CPU hotplug, so that function is not invoked for the CPU
that is being made active.

This commit works around the issue by ensuring the free_cpus bit
for a CPU is always cleared when the CPU is removed from a
root_domain. This likely makes the call of cpudl_clear_freecpu()
in rq_offline_dl() fully redundant, but I have not removed it
here because I am not certain of all flows.
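
For illustration only, the stale-bit condition can be modeled in
userspace. This is a sketch under stated assumptions, not kernel
code: struct model_rd and every model_* helper are hypothetical
stand-ins for the root_domain span and the cpudl free_cpus mask,
and the rq_online flag stands in for the check that guards
set_rq_offline() in rq_attach_root().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a root domain; not the kernel's structures. */
struct model_rd {
	uint64_t span;      /* CPUs attached to this root domain */
	uint64_t free_cpus; /* CPUs advertised as free for deadline tasks */
};

static void model_set_freecpu(struct model_rd *rd, int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	rd->free_cpus |= UINT64_C(1) << cpu;
}

static void model_clear_freecpu(struct model_rd *rd, int cpu)
{
	assert(cpu >= 0 && cpu < 64);
	rd->free_cpus &= ~(UINT64_C(1) << cpu);
}

/*
 * Models the detach half of rq_attach_root(). The offline path only
 * clears the bit when the rq was marked online in the old domain;
 * with_fix adds the unconditional clear proposed by this patch.
 */
static void model_detach(struct model_rd *old_rd, int cpu,
			 bool rq_online, bool with_fix)
{
	if (rq_online)
		model_clear_freecpu(old_rd, cpu); /* offline path */

	old_rd->span &= ~(UINT64_C(1) << cpu);

	if (with_fix)
		model_clear_freecpu(old_rd, cpu);
}

/* True when a CPU outside the span is still advertised as free. */
static bool model_has_stale_bit(const struct model_rd *rd)
{
	return (rd->free_cpus & ~rd->span) != 0;
}

/* Assumed scenario: CPU 1's bit gets set while its rq is not online. */
static bool model_demo(bool with_fix)
{
	struct model_rd rd = { .span = 0x3, .free_cpus = 0 };

	model_set_freecpu(&rd, 1);
	model_detach(&rd, 1, false, with_fix);
	return model_has_stale_bit(&rd);
}
```

In this model, model_demo(false) leaves CPU 1 marked free after it
has left the span, while model_demo(true) does not. How the bit
comes to be set in the real hazard is described in the thread
linked above; the model only assumes that it is.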

Someone more familiar with the scheduler implementation can
likely find a better solution, but this approach is minimally
invasive coming from someone who is not.

Signed-off-by: Doug Berger <opendmb@...il.com>
---
 kernel/sched/topology.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index a2a38e1b6f18..c10c5385031f 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -496,6 +496,7 @@ void rq_attach_root(struct rq *rq, struct root_domain *rd)
 			set_rq_offline(rq);
 
 		cpumask_clear_cpu(rq->cpu, old_rd->span);
+		cpudl_clear_freecpu(&old_rd->cpudl, rq->cpu);
 
 		/*
 		 * If we don't want to free the old_rd yet then
-- 
2.34.1

