lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241029225116.3998487-1-joel@joelfernandes.org>
Date: Tue, 29 Oct 2024 22:51:14 +0000
From: "Joel Fernandes (Google)" <joel@...lfernandes.org>
To: linux-kernel@...r.kernel.org,
	Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>,
	Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>
Cc: "Joel Fernandes (Google)" <joel@...lfernandes.org>,
	Suleiman Souhlal <suleiman@...gle.com>,
	Aashish Sharma <shraash@...gle.com>,
	Shin Kawamura <kawasin@...gle.com>,
	Vineeth Remanan Pillai <vineeth@...byteword.org>
Subject: [PATCH] dl_server: Reset DL server params when rd changes

During boot initialization, DL server parameters are initialized using the
default root domain before the proper scheduler domains and root domains
are built. This results in DL server parameters being tied to the default
root domain's bandwidth accounting instead of the actual root domain
assigned to the CPU after scheduler topology initialization.

When secondary CPUs are brought up, the dl_bw_cpus() accounting doesn't
properly track CPUs being added since the DL server was started too early
with the default root domain. Specifically, dl_bw_cpus() is called before
set_cpu_active() during secondary CPU bringup, causing it to not account
for the CPU being brought up in its capacity calculations. This causes
subsequent sysfs parameter updates to fail with -EBUSY due to bandwidth
accounting using the wrong root domain with zeroed total_bw.

This issue also causes under-utilization of system capacity. With the fix,
we see proper capacity initialization and scaling as CPUs come online - the
total system capacity increases from CPU 0 to CPU 1 and continues scaling
up as more CPUs are added (from cap=1024 initially to cap=8192 with 8
CPUs). Without the fix, the capacity initialization was incomplete since
dl_bw_cpus() runs before the CPU is marked active in set_cpu_active(),
leading to CPUs not being properly accounted for in the capacity
calculations.

Fix this by tracking the last root domain used for the DL server and
resetting the server parameters when the root domain changes. This ensures
bandwidth accounting uses the correct, fully initialized root domain after
the scheduler topology is built.

Cc: Steven Rostedt <rostedt@...dmis.org>
Cc: Suleiman Souhlal <suleiman@...gle.com>
Cc: Aashish Sharma <shraash@...gle.com>
Cc: Shin Kawamura <kawasin@...gle.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Cc: Juri Lelli <juri.lelli@...hat.com>
Cc: Vineeth Remanan Pillai <vineeth@...byteword.org>
Signed-off-by: Joel Fernandes (Google) <joel@...lfernandes.org>
---
 include/linux/sched.h   |  1 +
 kernel/sched/deadline.c | 13 +++++++++----
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 1d5cc3e50884..427d1da79d05 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -695,6 +695,7 @@ struct sched_dl_entity {
 	struct rq			*rq;
 	dl_server_has_tasks_f		server_has_tasks;
 	dl_server_pick_f		server_pick_task;
+	struct root_domain		*last_rd;
 
 #ifdef CONFIG_RT_MUTEXES
 	/*
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index d9d5a702f1a6..076b372f28b5 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1627,21 +1627,26 @@ void dl_server_update(struct sched_dl_entity *dl_se, s64 delta_exec)
 void dl_server_start(struct sched_dl_entity *dl_se)
 {
 	struct rq *rq = dl_se->rq;
+	int cpu = cpu_of(rq);
+	struct root_domain *rd = cpu_rq(cpu)->rd;
 
 	/*
 	 * XXX: the apply do not work fine at the init phase for the
 	 * fair server because things are not yet set. We need to improve
 	 * this before getting generic.
 	 */
-	if (!dl_server(dl_se)) {
+	if (!dl_server(dl_se) || dl_se->last_rd != rd) {
 		u64 runtime =  50 * NSEC_PER_MSEC;
 		u64 period = 1000 * NSEC_PER_MSEC;
 
+		dl_se->last_rd = rd;
 		dl_server_apply_params(dl_se, runtime, period, 1);
 
-		dl_se->dl_server = 1;
-		dl_se->dl_defer = 1;
-		setup_new_dl_entity(dl_se);
+		if (!dl_server(dl_se)) {
+			dl_se->dl_server = 1;
+			dl_se->dl_defer = 1;
+			setup_new_dl_entity(dl_se);
+		}
 	}
 
 	if (!dl_se->dl_runtime)
-- 
2.47.0.163.g1226f6d8fa-goog


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ