[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aPY9STUzTnNwKZ9V@jlelli-thinkpadt14gen4.remote.csb>
Date: Mon, 20 Oct 2025 15:46:49 +0200
From: Juri Lelli <juri.lelli@...hat.com>
To: Andrea Righi <arighi@...dia.com>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
Joel Fernandes <joelagnelf@...dia.com>, Tejun Heo <tj@...nel.org>,
David Vernet <void@...ifault.com>,
Changwoo Min <changwoo@...lia.com>, Shuah Khan <shuah@...nel.org>,
sched-ext@...ts.linux.dev, bpf@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH 08/14] sched/deadline: Add support to remove DL server's
bandwidth contribution
Hi!
On 17/10/25 11:25, Andrea Righi wrote:
> During switching from sched_ext to FAIR tasks and vice-versa, we need
> support for removing the bandwidth contribution of either DL server. Add
> support for the same.
>
> Co-developed-by: Joel Fernandes <joelagnelf@...dia.com>
> Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>
> Signed-off-by: Andrea Righi <arighi@...dia.com>
> ---
> kernel/sched/deadline.c | 31 +++++++++++++++++++++++++++++++
> kernel/sched/sched.h | 1 +
> 2 files changed, 32 insertions(+)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index 3c1fd2190949e..d585be4014427 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1684,6 +1684,12 @@ int dl_server_apply_params(struct sched_dl_entity *dl_se, u64 runtime, u64 perio
> dl_rq_change_utilization(rq, dl_se, new_bw);
> }
>
> + /* Clear these so that the dl_server is reinitialized */
> + if (new_bw == 0) {
> + dl_se->dl_defer = 0;
> + dl_se->dl_server = 0;
> + }
> +
> dl_se->dl_runtime = runtime;
> dl_se->dl_deadline = period;
> dl_se->dl_period = period;
> @@ -1697,6 +1703,31 @@ int dl_server_apply_params(struct sched_dl_entity *dl_se, u64 runtime, u64 perio
> return retval;
> }
>
> +/**
> + * dl_server_remove_params - Remove bandwidth reservation for a DL server
> + * @dl_se: The DL server entity to remove bandwidth for
> + *
> + * This function removes the bandwidth reservation for a DL server entity,
> + * cleaning up all bandwidth accounting and server state.
> + *
> + * Returns: 0 on success, negative error code on failure
> + */
> +int dl_server_remove_params(struct sched_dl_entity *dl_se)
> +{
> + if (!dl_se->dl_server)
> + return 0; /* Already disabled */
> +
> + /*
> + * First dequeue if still queued. It should not be queued since
> + * we call this only after the last dl_server_stop().
> + */
> + if (WARN_ON_ONCE(on_dl_rq(dl_se)))
> + dequeue_dl_entity(dl_se, DEQUEUE_SLEEP);
> +
> + /* Remove bandwidth reservation */
> + return dl_server_apply_params(dl_se, 0, dl_se->dl_period, false);
> +}
I am not sure this is correct wrt inactive_task_timer and
task_non_contending. I fear that removing bw immediately might break
other deadline entities guarantees (especially if one then maliciously
add/removes dl-servers quickly). I kind of think (but again not sure,
please Peter and other keep me honest :) we should be waiting for
inactive_task_timer to fire (if stopping before 0-lag) and let it clean
things up at that point (like we do for simple tasks).
You seem to have additional fixes later on in the series that might be
caused by what I describe above.
Thinking more about this I actually wonder if we need this (well it's
coming up with later patches) mechanism for automatically removing
servers based on fair vs. scx state (full/partial). If we are going to
manage dl-servers bw explicitly and separately [1], maybe we can just
leave the burden to the user (or middleware) of doing that via the
configuration interface?
Thanks,
Juri
1 - https://lore.kernel.org/lkml/aPYDhjqe99F91FTW@jlelli-thinkpadt14gen4.remote.csb/
Powered by blists - more mailing lists