lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20260203103407.GK1282955@noisy.programming.kicks-ass.net>
Date: Tue, 3 Feb 2026 11:34:07 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Andrea Righi <arighi@...dia.com>
Cc: Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>, Tejun Heo <tj@...nel.org>,
	Joel Fernandes <joelagnelf@...dia.com>,
	David Vernet <void@...ifault.com>,
	Changwoo Min <changwoo@...lia.com>,
	Daniel Hodges <hodgesd@...a.com>,
	Christian Loehle <christian.loehle@....com>,
	Emil Tsalapatis <emil@...alapatis.com>, sched-ext@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/7] sched/debug: Stop and start server based on if it
 was active

On Mon, Feb 02, 2026 at 11:37:31PM +0100, Andrea Righi wrote:

> Or:
> 
>     pr_info("%s server %sabled in CPU %d%s\n",
>               server == &rq->fair_server ? "Fair" : "Ext",
>               runtime ? "en" : "dis",
>               cpu_of(rq),
>               runtime ? "" : ", system may crash due to starvation");

Yeah, I noticed it was a bit wonkey. I made it thus.

> > +	}
> > +
> >  	*ppos += cnt;
> >  	return cnt;
> >  }
> 
> I like that, it should fix the issue.

There is one more issue when dl_server_apply_params() fails, in that
case we should test old_runtime to determine if we should (re)start the
dl_server.

I've ended up with this.

---
Subject: sched/debug: Fix dl_server (re)start conditions
From: Peter Zijlstra <peterz@...radead.org>
Date: Tue Feb 3 11:05:12 CET 2026

There are two problems with sched_server_write_common() that can cause the
dl_server to malfunction upon attempting to change the parameters:

1) when, after having disabled the dl_server by setting runtime=0, it is
   enabled again while tasks are already enqueued. In this case is_active would
   still be 0 and dl_server_start() would not be called.

2) when dl_server_apply_params() would fail, runtime is not applied and does
   not reflect the new state.

Instead have dl_server_start() check its actual dl_runtime, and have
sched_server_write_common() unconditionally (re)start the dl_server. It will
automatically stop if there isn't anything to do, so spurious activation is
harmless -- while failing to start it is a problem.

While there, move the printk out of the locked region and make it symmetric,
also printing on enable.

Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
---
 kernel/sched/deadline.c |    5 ++---
 kernel/sched/debug.c    |   32 ++++++++++++++------------------
 2 files changed, 16 insertions(+), 21 deletions(-)

--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1784,7 +1784,7 @@ void dl_server_start(struct sched_dl_ent
 {
 	struct rq *rq = dl_se->rq;
 
-	if (!dl_server(dl_se) || dl_se->dl_server_active)
+	if (!dl_server(dl_se) || dl_se->dl_server_active || !dl_se->dl_runtime)
 		return;
 
 	/*
@@ -1882,7 +1882,6 @@ int dl_server_apply_params(struct sched_
 	int cpu = cpu_of(rq);
 	struct dl_bw *dl_b;
 	unsigned long cap;
-	int retval = 0;
 	int cpus;
 
 	dl_b = dl_bw_of(cpu);
@@ -1914,7 +1913,7 @@ int dl_server_apply_params(struct sched_
 	dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
 	dl_se->dl_density = to_ratio(dl_se->dl_deadline, dl_se->dl_runtime);
 
-	return retval;
+	return 0;
 }
 
 /*
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -338,9 +338,9 @@ static ssize_t sched_server_write_common
 					 void *server)
 {
 	long cpu = (long) ((struct seq_file *) filp->private_data)->private;
-	struct rq *rq = cpu_rq(cpu);
 	struct sched_dl_entity *dl_se = (struct sched_dl_entity *)server;
-	u64 runtime, period;
+	u64 old_runtime, runtime, period;
+	struct rq *rq = cpu_rq(cpu);
 	int retval = 0;
 	size_t err;
 	u64 value;
@@ -350,9 +350,7 @@ static ssize_t sched_server_write_common
 		return err;
 
 	scoped_guard (rq_lock_irqsave, rq) {
-		bool is_active;
-
-		runtime = dl_se->dl_runtime;
+		old_runtime = runtime = dl_se->dl_runtime;
 		period = dl_se->dl_period;
 
 		switch (param) {
@@ -374,25 +372,23 @@ static ssize_t sched_server_write_common
 			return  -EINVAL;
 		}
 
-		is_active = dl_server_active(dl_se);
-		if (is_active) {
-			update_rq_clock(rq);
-			dl_server_stop(dl_se);
-		}
-
+		update_rq_clock(rq);
+		dl_server_stop(dl_se);
 		retval = dl_server_apply_params(dl_se, runtime, period, 0);
-
-		if (!runtime)
-			printk_deferred("%s server disabled in CPU %d, system may crash due to starvation.\n",
-					server == &rq->fair_server ? "Fair" : "Ext", cpu_of(rq));
-
-		if (is_active && runtime)
-			dl_server_start(dl_se);
+		dl_server_start(dl_se);
 
 		if (retval < 0)
 			return retval;
 	}
 
+	if (!!old_runtime ^ !!runtime) {
+		pr_info("%s server %sabled on CPU %d%s.\n",
+			server == &rq->fair_server ? "Fair" : "Ext",
+			runtime ? "en" : "dis",
+			cpu_of(rq),
+			runtime ? "" : ", system may malfunction due to starvation");
+	}
+
 	*ppos += cnt;
 	return cnt;
 }

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ