Message-ID: <Z7RhNmLpOb7SLImW@jlelli-thinkpadt14gen4.remote.csb>
Date: Tue, 18 Feb 2025 11:30:14 +0100
From: Juri Lelli <juri.lelli@...hat.com>
To: Jon Hunter <jonathanh@...dia.com>
Cc: Christian Loehle <christian.loehle@....com>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Thierry Reding <treding@...dia.com>,
	Waiman Long <longman@...hat.com>, Tejun Heo <tj@...nel.org>,
	Johannes Weiner <hannes@...xchg.org>,
	Michal Koutny <mkoutny@...e.com>, Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Phil Auld <pauld@...hat.com>, Qais Yousef <qyousef@...alina.io>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	"Joel Fernandes (Google)" <joel@...lfernandes.org>,
	Suleiman Souhlal <suleiman@...gle.com>,
	Aashish Sharma <shraash@...gle.com>,
	Shin Kawamura <kawasin@...gle.com>,
	Vineeth Remanan Pillai <vineeth@...byteword.org>,
	linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
	"linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>
Subject: Re: [PATCH v2 3/2] sched/deadline: Check bandwidth overflow earlier
 for hotplug

On 18/02/25 10:58, Juri Lelli wrote:
> Hi!
> 
> On 17/02/25 17:08, Juri Lelli wrote:
> > On 14/02/25 10:05, Jon Hunter wrote:
> 
> ...
> 
> > At this point I believe you triggered suspend.
> > 
> > > [   57.290150] Freezing remaining freezable tasks completed (elapsed 0.001 seconds)
> > > [   57.335619] tegra-xusb 3530000.usb: Firmware timestamp: 2020-07-06 13:39:28 UTC
> > > [   57.353364] dwc-eth-dwmac 2490000.ethernet eth0: Link is Down
> > > [   57.397022] Disabling non-boot CPUs ...
> > 
> > Offlining CPU5.
> > 
> > > [   57.400904] dl_bw_manage: cpu=5 cap=3072 fair_server_bw=52428 total_bw=209712 dl_bw_cpus=4 type=DYN span=0,3-5
> > > [   57.400949] CPU0 attaching NULL sched-domain.
> > > [   57.415298] span=1-2
> > > [   57.417483] __dl_sub: cpus=3 tsk_bw=52428 total_bw=157284 span=0,3-5 type=DYN
> > > [   57.417487] __dl_server_detach_root: cpu=0 rd_span=0,3-5 total_bw=157284
> > > [   57.417496] rq_attach_root: cpu=0 old_span=NULL new_span=1-2
> > > [   57.417501] __dl_add: cpus=3 tsk_bw=52428 total_bw=157284 span=0-2 type=DEF
> > > [   57.417504] __dl_server_attach_root: cpu=0 rd_span=0-2 total_bw=157284
> > > [   57.417507] CPU3 attaching NULL sched-domain.
> > > [   57.454804] span=0-2
> > > [   57.456987] __dl_sub: cpus=2 tsk_bw=52428 total_bw=104856 span=3-5 type=DYN
> > > [   57.456990] __dl_server_detach_root: cpu=3 rd_span=3-5 total_bw=104856
> > > [   57.456998] rq_attach_root: cpu=3 old_span=NULL new_span=0-2
> > > [   57.457000] __dl_add: cpus=4 tsk_bw=52428 total_bw=209712 span=0-3 type=DEF
> > > [   57.457003] __dl_server_attach_root: cpu=3 rd_span=0-3 total_bw=209712
> > > [   57.457006] CPU4 attaching NULL sched-domain.
> > > [   57.493964] span=0-3
> > > [   57.496152] __dl_sub: cpus=1 tsk_bw=52428 total_bw=52428 span=4-5 type=DYN
> > > [   57.496156] __dl_server_detach_root: cpu=4 rd_span=4-5 total_bw=52428
> > > [   57.496162] rq_attach_root: cpu=4 old_span=NULL new_span=0-3
> > > [   57.496165] __dl_add: cpus=5 tsk_bw=52428 total_bw=262140 span=0-4 type=DEF
> > > [   57.496168] __dl_server_attach_root: cpu=4 rd_span=0-4 total_bw=262140
> > > [   57.496171] CPU5 attaching NULL sched-domain.
> > > [   57.532952] span=0-4
> > > [   57.535143] rq_attach_root: cpu=5 old_span= new_span=0-4
> > > [   57.535147] __dl_add: cpus=5 tsk_bw=52428 total_bw=314568 span=0-5 type=DEF
> > 
> > Maybe we shouldn't add the dl_server contribution of a CPU that is going
> > to be offline.
> 
> I tried to implement this idea and ended up with the following. As usual
> also pushed it to the branch on github. Could you please update and
> re-test?

And now for the actual change:

---
 kernel/sched/topology.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 8830acb4f1b2..c6a140d8d851 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -497,12 +497,14 @@ void rq_attach_root(struct rq *rq, struct root_domain *rd)
 	if (rq->rd) {
 		old_rd = rq->rd;
 
-		if (rq->fair_server.dl_server)
-			__dl_server_detach_root(&rq->fair_server, rq);
-
-		if (cpumask_test_cpu(rq->cpu, old_rd->online))
+		if (cpumask_test_cpu(rq->cpu, old_rd->online)) {
 			set_rq_offline(rq);
 
+			if (rq->fair_server.dl_server)
+				__dl_server_detach_root(&rq->fair_server, rq);
+		}
+
+
 		cpumask_clear_cpu(rq->cpu, old_rd->span);
 
 		/*
@@ -529,16 +531,17 @@ void rq_attach_root(struct rq *rq, struct root_domain *rd)
 	}
 
 	cpumask_set_cpu(rq->cpu, rd->span);
-	if (cpumask_test_cpu(rq->cpu, cpu_active_mask))
+	if (cpumask_test_cpu(rq->cpu, cpu_active_mask)) {
 		set_rq_online(rq);
 
-	/*
-	 * Because the rq is not a task, dl_add_task_root_domain() did not
-	 * move the fair server bw to the rd if it already started.
-	 * Add it now.
-	 */
-	if (rq->fair_server.dl_server)
-		__dl_server_attach_root(&rq->fair_server, rq);
+		/*
+		 * Because the rq is not a task, dl_add_task_root_domain() did not
+		 * move the fair server bw to the rd if it already started.
+		 * Add it now.
+		 */
+		if (rq->fair_server.dl_server)
+			__dl_server_attach_root(&rq->fair_server, rq);
+	}
 
 	rq_unlock_irqrestore(rq, &rf);

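For reference, the accounting effect of only attaching the fair server
bandwidth for active CPUs can be sketched with a small user-space toy
model. This is not kernel code: toy_rd, toy_attach() and
toy_suspend_total() are hypothetical names made up for illustration, and
the 50ms runtime over a 1s period is an assumption chosen because it
reproduces the fair_server_bw=52428 value in the log above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BW_SHIFT 20	/* fixed-point shift; assumed, it matches the logged 52428 */

/* Utilization as a fixed-point fraction: (runtime / period) << BW_SHIFT. */
static uint64_t to_ratio(uint64_t period, uint64_t runtime)
{
	assert(period != 0);
	return (runtime << BW_SHIFT) / period;
}

/* Hypothetical stand-in for a root domain: only tracks the summed bandwidth. */
struct toy_rd {
	uint64_t total_bw;
};

/*
 * Toy version of the attach step. With gate_on_active set, the server
 * contribution is only added when the CPU is active, which is the idea
 * behind the change above. Without it, every CPU contributes.
 */
static void toy_attach(struct toy_rd *rd, bool cpu_active,
		       uint64_t server_bw, bool gate_on_active)
{
	if (!gate_on_active || cpu_active)
		rd->total_bw += server_bw;
}

/*
 * Attach nr_cpus CPUs to a fresh toy root domain, of which only the first
 * nr_active are still active, and return the resulting total bandwidth.
 */
static uint64_t toy_suspend_total(unsigned int nr_cpus, unsigned int nr_active,
				  bool gate_on_active)
{
	struct toy_rd def_rd = { 0 };
	/* Assumed fair server parameters: 50ms runtime, 1000ms period. */
	uint64_t bw = to_ratio(1000, 50);

	for (unsigned int cpu = 0; cpu < nr_cpus; cpu++)
		toy_attach(&def_rd, cpu < nr_active, bw, gate_on_active);

	return def_rd.total_bw;
}
```

With 6 CPUs of which 5 are still active, toy_suspend_total(6, 5, false)
sums to 314568, the total_bw seen in the last __dl_add line of the log,
while toy_suspend_total(6, 5, true) stops at 262140 because the CPU going
offline no longer contributes.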
