[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <0e76c836-c645-4fd0-9d86-b47b8834eac2@amd.com>
Date: Fri, 9 Jan 2026 11:47:10 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: Aaron Tomlin <atomlin@...mlin.com>, <mingo@...hat.com>,
<peterz@...radead.org>, <juri.lelli@...hat.com>,
<vincent.guittot@...aro.org>, <dietmar.eggemann@....com>,
<rostedt@...dmis.org>, <bsegall@...gle.com>, <mgorman@...e.de>,
<vschneid@...hat.com>
CC: <sshegde@...ux.ibm.com>, <neelx@...e.com>, <sean@...e.io>,
<mproche@...il.com>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH 1/1] sched/deadline: Log Fair Server re-enablement for
symmetry with debugfs
On 1/9/2026 8:49 AM, Aaron Tomlin wrote:
> Currently, the scheduler's debug interface emits a notification to the
> console when the Fair Server is explicitly disabled via the fair_server
> sysfs attribute. However, no corresponding log entry is generated when
> the server is subsequently re-enabled.
>
> This omission results in an asymmetry within the kernel logs,
> potentially obscuring the true operational state of the scheduler during
> debugging or performance analysis.
Well, if you are disabling the fair_server, you're opening the doors to
bigger problems and that printk mainly serves as an indicator to dismiss
user induced starvation issues during debugs.
Why do you care about the symmetry of this log when you shouldn't be
setting the runtime to 0 in the first place?
>
> This patch amends dl_server_apply_params() to introduce the requisite
> logging. By detecting the transition from zero to non-zero
> bandwidth - strictly for the Fair Server entity and excluding
> initialisation - we ensure that a "Fair server re-enabled" message is
> emitted. This restores logging symmetry and provides administrators with
> a clear audit trail of manual runtime adjustments.
>
> Signed-off-by: Aaron Tomlin <atomlin@...mlin.com>
> ---
> kernel/sched/deadline.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 319439fe1870..e64fb988e957 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1867,6 +1867,7 @@ int dl_server_apply_params(struct sched_dl_entity *dl_se, u64 runtime, u64 perio
> u64 old_bw = init ? 0 : to_ratio(dl_se->dl_period, dl_se->dl_runtime);
> u64 new_bw = to_ratio(period, runtime);
> struct rq *rq = dl_se->rq;
> + bool fair_server = dl_se == &rq->fair_server;
> int cpu = cpu_of(rq);
> struct dl_bw *dl_b;
> unsigned long cap;
> @@ -1876,6 +1877,11 @@ int dl_server_apply_params(struct sched_dl_entity *dl_se, u64 runtime, u64 perio
> dl_b = dl_bw_of(cpu);
> guard(raw_spinlock)(&dl_b->lock);
>
> + /* Symmetric to disable message in sched_fair_server_write() */
> + if (!init && fair_server && !old_bw && new_bw)
> + printk_deferred("Fair server re-enabled on CPU %d.\n",
> + cpu);
That is an absolutely terrible place to put it. Why can't we have it in
sched_fair_server_write() for DL_RUNTIME when the
"rq->fair_server.dl_runtime" is 0 initially and is modified to a
non-zero value similar to the "Fair server disabled" message?
I still think once the fair server is disabled, the pieces are for the
user to keep. I wouldn't want us debugging:
Fair server disabled in CPU X ...
Fair server re-enabled in CPU X ...
INFO: rcu_tasks detected stalls ...
only to realise the stalls were a result of starving the fair threads
and the fair server didn't run in time / didn't have enough B/W to
prevent that stall.
> +
> cpus = dl_bw_cpus(cpu);
> cap = dl_bw_capacity(cpu);
>
--
Thanks and Regards,
Prateek
Powered by blists - more mailing lists