Message-ID: <Z6M5fQB9P1_bDF7A@jlelli-thinkpadt14gen4.remote.csb>
Date: Wed, 5 Feb 2025 11:12:13 +0100
From: Juri Lelli <juri.lelli@...hat.com>
To: Jon Hunter <jonathanh@...dia.com>
Cc: Thierry Reding <treding@...dia.com>, Waiman Long <longman@...hat.com>,
	Tejun Heo <tj@...nel.org>, Johannes Weiner <hannes@...xchg.org>,
	Michal Koutny <mkoutny@...e.com>, Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>,
	Phil Auld <pauld@...hat.com>, Qais Yousef <qyousef@...alina.io>,
	Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
	"Joel Fernandes (Google)" <joel@...lfernandes.org>,
	Suleiman Souhlal <suleiman@...gle.com>,
	Aashish Sharma <shraash@...gle.com>,
	Shin Kawamura <kawasin@...gle.com>,
	Vineeth Remanan Pillai <vineeth@...byteword.org>,
	linux-kernel@...r.kernel.org, cgroups@...r.kernel.org,
	"linux-tegra@...r.kernel.org" <linux-tegra@...r.kernel.org>
Subject: Re: [PATCH v2 3/2] sched/deadline: Check bandwidth overflow earlier
 for hotplug

On 05/02/25 07:53, Juri Lelli wrote:
> On 03/02/25 11:01, Jon Hunter wrote:
> > Hi Juri,
> > 
> > On 16/01/2025 15:55, Juri Lelli wrote:
> > > On 16/01/25 13:14, Jon Hunter wrote:
> 
> ...
> 
> > > > [  210.595431] dl_bw_manage: cpu=5 cap=3072 fair_server_bw=52428 total_bw=209712 dl_bw_cpus=4
> > > > [  210.606269] dl_bw_manage: cpu=4 cap=2048 fair_server_bw=52428 total_bw=157284 dl_bw_cpus=3
> > > > [  210.617281] dl_bw_manage: cpu=3 cap=1024 fair_server_bw=52428 total_bw=104856 dl_bw_cpus=2
> > > > [  210.627205] dl_bw_manage: cpu=2 cap=1024 fair_server_bw=52428 total_bw=262140 dl_bw_cpus=2
> > > > [  210.637752] dl_bw_manage: cpu=1 cap=0 fair_server_bw=52428 total_bw=262140 dl_bw_cpus=1
> > >                                                                            ^
> > > Different from before, but still not what I expected. Looks like there
> > > are conditions/paths I currently cannot replicate on my setup, so more
> > > thinking is needed. Unfortunately I will be out traveling next week, so
> > > this might require a bit of time.
> > 
> > 
> > I see that this is now in mainline and our board is still failing to
> > suspend. Let me know if there is anything else you need me to test.
> 
> Ah, can you actually add 'sched_verbose' to your kernel cmdline? It
> should print out additional debug info on the console when domains get
> reconfigured by hotplug/suspend, e.g.
> 
>  dl_bw_manage: cpu=3 cap=3072 fair_server_bw=52428 total_bw=209712 dl_bw_cpus=4
>  CPU0 attaching NULL sched-domain.
>  CPU3 attaching NULL sched-domain.
>  CPU4 attaching NULL sched-domain.
>  CPU5 attaching NULL sched-domain.
>  CPU0 attaching sched-domain(s):
>   domain-0: span=0,4-5 level=MC
>    groups: 0:{ span=0 cap=766 }, 4:{ span=4 cap=908 }, 5:{ span=5 cap=989 }
>  CPU4 attaching sched-domain(s):
>   domain-0: span=0,4-5 level=MC
>    groups: 4:{ span=4 cap=908 }, 5:{ span=5 cap=989 }, 0:{ span=0 cap=766 }
>  CPU5 attaching sched-domain(s):
>   domain-0: span=0,4-5 level=MC
>    groups: 5:{ span=5 cap=989 }, 0:{ span=0 cap=766 }, 4:{ span=4 cap=908 }
>  root domain span: 0,4-5
>  rd 0,4-5: Checking EAS, CPUs do not have asymmetric capacities
>  psci: CPU3 killed (polled 0 ms)
> 
> Can you please share this information as well if you are able to collect
> it (while still running with my last proposed fix)?

Also, if you don't mind, add the following on top of the existing
changes.

Just to be sure we don't get out of sync, I pushed the current set to

https://github.com/jlelli/linux.git experimental/dl-debug
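
If it is easier to test from there, the usual fetch steps should work; a
sketch, assuming you name the remote 'jlelli':

  git remote add jlelli https://github.com/jlelli/linux.git
  git fetch jlelli
  git checkout jlelli/experimental/dl-debug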

---
 kernel/sched/deadline.c | 2 +-
 kernel/sched/topology.c | 5 ++++-
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 9a47decd099a..504ff302299a 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -3545,7 +3545,7 @@ static int dl_bw_manage(enum dl_bw_request req, int cpu, u64 dl_bw)
 		 * dl_servers we can discount, as tasks will be moved out the
 		 * offlined CPUs anyway.
 		 */
-		printk_deferred("%s: cpu=%d cap=%lu fair_server_bw=%llu total_bw=%llu dl_bw_cpus=%d\n", __func__, cpu, cap, fair_server_bw, dl_b->total_bw, dl_bw_cpus(cpu));
+		printk_deferred("%s: cpu=%d cap=%lu fair_server_bw=%llu total_bw=%llu dl_bw_cpus=%d type=%s span=%*pbl\n", __func__, cpu, cap, fair_server_bw, dl_b->total_bw, dl_bw_cpus(cpu), (cpu_rq(cpu)->rd == &def_root_domain) ? "DEF" : "DYN", cpumask_pr_args(cpu_rq(cpu)->rd->span));
 		if (dl_b->total_bw - fair_server_bw > 0) {
 			/*
 			 * Leaving at least one CPU for DEADLINE tasks seems a
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index 93b08e76a52a..996270cd5bd2 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -137,6 +137,7 @@ static void sched_domain_debug(struct sched_domain *sd, int cpu)
 
 	if (!sd) {
 		printk(KERN_DEBUG "CPU%d attaching NULL sched-domain.\n", cpu);
+		printk(KERN_CONT "span=%*pbl\n", cpumask_pr_args(def_root_domain.span));
 		return;
 	}
 
@@ -2534,8 +2535,10 @@ build_sched_domains(const struct cpumask *cpu_map, struct sched_domain_attr *att
 	if (has_cluster)
 		static_branch_inc_cpuslocked(&sched_cluster_active);
 
-	if (rq && sched_debug_verbose)
+	if (rq && sched_debug_verbose) {
 		pr_info("root domain span: %*pbl\n", cpumask_pr_args(cpu_map));
+		pr_info("default domain span: %*pbl\n", cpumask_pr_args(def_root_domain.span));
+	}
 
 	ret = 0;
 error:
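
In case the format specifiers above are unfamiliar: %*pbl prints a bitmap
as a CPU-list string (e.g. "0,4-5"), and cpumask_pr_args() expands to the
width/bits pair that format expects, so the extra printks should show each
root domain's span directly. A minimal sketch of the same pattern, as a
hypothetical out-of-tree module (not part of the patch above):

  /* Hypothetical demo module; it only illustrates the %*pbl +
   * cpumask_pr_args() pattern used by the debug printks in the patch. */
  #include <linux/module.h>
  #include <linux/cpumask.h>
  #include <linux/printk.h>

  static int __init pbl_demo_init(void)
  {
  	/* cpumask_pr_args(m) expands to "nr_cpu_ids, cpumask_bits(m)",
  	 * which is exactly what the %*pbl bitmap-list format expects. */
  	pr_info("pbl_demo: online CPUs: %*pbl\n",
  		cpumask_pr_args(cpu_online_mask));
  	return 0;
  }

  static void __exit pbl_demo_exit(void) { }

  module_init(pbl_demo_init);
  module_exit(pbl_demo_exit);
  MODULE_LICENSE("GPL");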

