[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20260203103407.GK1282955@noisy.programming.kicks-ass.net>
Date: Tue, 3 Feb 2026 11:34:07 +0100
From: Peter Zijlstra <peterz@...radead.org>
To: Andrea Righi <arighi@...dia.com>
Cc: Ingo Molnar <mingo@...hat.com>, Juri Lelli <juri.lelli@...hat.com>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>, Tejun Heo <tj@...nel.org>,
Joel Fernandes <joelagnelf@...dia.com>,
David Vernet <void@...ifault.com>,
Changwoo Min <changwoo@...lia.com>,
Daniel Hodges <hodgesd@...a.com>,
Christian Loehle <christian.loehle@....com>,
Emil Tsalapatis <emil@...alapatis.com>, sched-ext@...ts.linux.dev,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/7] sched/debug: Stop and start server based on if it
was active
On Mon, Feb 02, 2026 at 11:37:31PM +0100, Andrea Righi wrote:
> Or:
>
> pr_info("%s server %sabled in CPU %d%s\n",
> server == &rq->fair_server ? "Fair" : "Ext",
> runtime ? "en" : "dis",
> cpu_of(rq),
> runtime ? "" : ", system may crash due to starvation");
Yeah, I noticed it was a bit wonkey. I made it thus.
> > + }
> > +
> > *ppos += cnt;
> > return cnt;
> > }
>
> I like that, it should fix the issue.
There is one more issue when dl_server_apply_params() fails, in that
case we should test old_runtime to determine if we should (re)start the
dl_server.
I've ended up with this.
---
Subject: sched/debug: Fix dl_server (re)start conditions
From: Peter Zijlstra <peterz@...radead.org>
Date: Tue Feb 3 11:05:12 CET 2026
There are two problems with sched_server_write_common() that can cause the
dl_server to malfunction upon attempting to change the parameters:
1) when, after having disabled the dl_server by setting runtime=0, it is
enabled again while tasks are already enqueued. In this case is_active would
still be 0 and dl_server_start() would not be called.
2) when dl_server_apply_params() would fail, runtime is not applied and does
not reflect the new state.
Instead have dl_server_start() check its actual dl_runtime, and have
sched_server_write_common() unconditionally (re)start the dl_server. It will
automatically stop if there isn't anything to do, so spurious activation is
harmless -- while failing to start it is a problem.
While there, move the printk out of the locked region and make it symmetric,
also printing on enable.
Signed-off-by: Peter Zijlstra (Intel) <peterz@...radead.org>
---
kernel/sched/deadline.c | 5 ++---
kernel/sched/debug.c | 32 ++++++++++++++------------------
2 files changed, 16 insertions(+), 21 deletions(-)
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1784,7 +1784,7 @@ void dl_server_start(struct sched_dl_ent
{
struct rq *rq = dl_se->rq;
- if (!dl_server(dl_se) || dl_se->dl_server_active)
+ if (!dl_server(dl_se) || dl_se->dl_server_active || !dl_se->dl_runtime)
return;
/*
@@ -1882,7 +1882,6 @@ int dl_server_apply_params(struct sched_
int cpu = cpu_of(rq);
struct dl_bw *dl_b;
unsigned long cap;
- int retval = 0;
int cpus;
dl_b = dl_bw_of(cpu);
@@ -1914,7 +1913,7 @@ int dl_server_apply_params(struct sched_
dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
dl_se->dl_density = to_ratio(dl_se->dl_deadline, dl_se->dl_runtime);
- return retval;
+ return 0;
}
/*
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -338,9 +338,9 @@ static ssize_t sched_server_write_common
void *server)
{
long cpu = (long) ((struct seq_file *) filp->private_data)->private;
- struct rq *rq = cpu_rq(cpu);
struct sched_dl_entity *dl_se = (struct sched_dl_entity *)server;
- u64 runtime, period;
+ u64 old_runtime, runtime, period;
+ struct rq *rq = cpu_rq(cpu);
int retval = 0;
size_t err;
u64 value;
@@ -350,9 +350,7 @@ static ssize_t sched_server_write_common
return err;
scoped_guard (rq_lock_irqsave, rq) {
- bool is_active;
-
- runtime = dl_se->dl_runtime;
+ old_runtime = runtime = dl_se->dl_runtime;
period = dl_se->dl_period;
switch (param) {
@@ -374,25 +372,23 @@ static ssize_t sched_server_write_common
return -EINVAL;
}
- is_active = dl_server_active(dl_se);
- if (is_active) {
- update_rq_clock(rq);
- dl_server_stop(dl_se);
- }
-
+ update_rq_clock(rq);
+ dl_server_stop(dl_se);
retval = dl_server_apply_params(dl_se, runtime, period, 0);
-
- if (!runtime)
- printk_deferred("%s server disabled in CPU %d, system may crash due to starvation.\n",
- server == &rq->fair_server ? "Fair" : "Ext", cpu_of(rq));
-
- if (is_active && runtime)
- dl_server_start(dl_se);
+ dl_server_start(dl_se);
if (retval < 0)
return retval;
}
+ if (!!old_runtime ^ !!runtime) {
+ pr_info("%s server %sabled on CPU %d%s.\n",
+ server == &rq->fair_server ? "Fair" : "Ext",
+ runtime ? "en" : "dis",
+ cpu_of(rq),
+ runtime ? "" : ", system may malfunction due to starvation");
+ }
+
*ppos += cnt;
return cnt;
}
Powered by blists - more mailing lists