Message-ID: <aOYSsrIjzcq5Av34@jlelli-thinkpadt14gen4.remote.csb>
Date: Wed, 8 Oct 2025 09:28:50 +0200
From: Juri Lelli <juri.lelli@...hat.com>
To: Yuri Andriaccio <yurand2000@...il.com>
Cc: Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>,
linux-kernel@...r.kernel.org,
Luca Abeni <luca.abeni@...tannapisa.it>,
Yuri Andriaccio <yuri.andriaccio@...tannapisa.it>
Subject: Re: [RFC PATCH v3 09/24] sched/rt: Add {alloc/free}_rt_sched_group
Hello,
On 29/09/25 11:22, Yuri Andriaccio wrote:
> From: luca abeni <luca.abeni@...tannapisa.it>
>
> Add allocation and deallocation code for rt-cgroups.
> Declare dl_server specific functions (no implementation yet), needed by
> the allocation code.
>
> Co-developed-by: Alessio Balsini <a.balsini@...up.it>
> Signed-off-by: Alessio Balsini <a.balsini@...up.it>
> Co-developed-by: Andrea Parri <parri.andrea@...il.com>
> Signed-off-by: Andrea Parri <parri.andrea@...il.com>
> Co-developed-by: Yuri Andriaccio <yurand2000@...il.com>
> Signed-off-by: Yuri Andriaccio <yurand2000@...il.com>
> Signed-off-by: luca abeni <luca.abeni@...tannapisa.it>
> ---
> kernel/sched/rt.c | 89 +++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 89 insertions(+)
>
> diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
> index 84dbb4853b6..3094f59d0c8 100644
> --- a/kernel/sched/rt.c
> +++ b/kernel/sched/rt.c
> @@ -93,8 +93,43 @@ void unregister_rt_sched_group(struct task_group *tg)
>
> void free_rt_sched_group(struct task_group *tg)
> {
> + int i;
> + unsigned long flags;
> + struct rq *served_rq;
> +
> if (!rt_group_sched_enabled())
> return;
> +
> + if (!tg->dl_se || !tg->rt_rq)
> + return;
> +
> + for_each_possible_cpu(i) {
> + if (!tg->dl_se[i] || !tg->rt_rq[i])
> + continue;
> +
> + /*
> + * Shutdown the dl_server and free it
> + *
> + * Since the dl timer is going to be cancelled,
> +		 * we risk never decreasing the running bw...
> + * Fix this issue by changing the group runtime
> + * to 0 immediately before freeing it.
> + */
> + dl_init_tg(tg->dl_se[i], 0, tg->dl_se[i]->dl_period);
> +
> + raw_spin_rq_lock_irqsave(cpu_rq(i), flags);
> + BUG_ON(tg->rt_rq[i]->rt_nr_running);
Crashing is probably a bit harsh. If there are still tasks in the group at
this point, what should have happened earlier that didn't? Can we maybe WARN
and try to recover without crashing?
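Untested, but maybe something along these lines (reusing the names from the
patch), leaking the per-CPU structures for that CPU rather than taking the
whole machine down:

		raw_spin_rq_lock_irqsave(cpu_rq(i), flags);
		if (WARN_ON_ONCE(tg->rt_rq[i]->rt_nr_running)) {
			/* Tasks are still enqueued: better to leak than to crash. */
			raw_spin_rq_unlock_irqrestore(cpu_rq(i), flags);
			continue;
		}
		hrtimer_cancel(&tg->dl_se[i]->dl_timer);
		raw_spin_rq_unlock_irqrestore(cpu_rq(i), flags);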
> + hrtimer_cancel(&tg->dl_se[i]->dl_timer);
> + raw_spin_rq_unlock_irqrestore(cpu_rq(i), flags);
> + kfree(tg->dl_se[i]);
> +
> + /* Free the local per-cpu runqueue */
> + served_rq = container_of(tg->rt_rq[i], struct rq, rt);
> + kfree(served_rq);
> + }
> +
> + kfree(tg->rt_rq);
> + kfree(tg->dl_se);
...
> int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
> {
> + struct rq *s_rq;
> + struct sched_dl_entity *dl_se;
> + int i;
> +
> if (!rt_group_sched_enabled())
> return 1;
>
> + tg->rt_rq = kcalloc(nr_cpu_ids, sizeof(struct rt_rq *), GFP_KERNEL);
> + if (!tg->rt_rq)
> + return 0;
> +
> + tg->dl_se = kcalloc(nr_cpu_ids, sizeof(dl_se), GFP_KERNEL);
Nit. Maybe sizeof(struct sched_dl_entity *) is clearer and more consistent
with the rt_rq allocation above (the current form is still correct, of course).
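I.e., something like:

	tg->dl_se = kcalloc(nr_cpu_ids, sizeof(struct sched_dl_entity *), GFP_KERNEL);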
> + if (!tg->dl_se) {
> + kfree(tg->rt_rq);
> + tg->rt_rq = NULL;
> + return 0;
> + }
> +
> + init_dl_bandwidth(&tg->dl_bandwidth, 0, 0);
> +
> + for_each_possible_cpu(i) {
> + s_rq = kzalloc_node(sizeof(struct rq),
> + GFP_KERNEL, cpu_to_node(i));
> + if (!s_rq)
> + return 0;
> +
> + dl_se = kzalloc_node(sizeof(struct sched_dl_entity),
> + GFP_KERNEL, cpu_to_node(i));
> + if (!dl_se) {
> + kfree(s_rq);
> + return 0;
> + }
Aren't we leaking if an allocation failure happens midway through this for
loop? Do we free the data structures already allocated for previous CPUs?
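Untested sketch of one possible way to unwind (assuming free_rt_sched_group()
copes with a partially initialized group, which the NULL checks at the top of
its loop seem to allow):

	for_each_possible_cpu(i) {
		s_rq = kzalloc_node(sizeof(struct rq),
				    GFP_KERNEL, cpu_to_node(i));
		if (!s_rq)
			goto err_free;

		dl_se = kzalloc_node(sizeof(struct sched_dl_entity),
				     GFP_KERNEL, cpu_to_node(i));
		if (!dl_se) {
			kfree(s_rq);
			goto err_free;
		}

		/* per-CPU init as in the patch ... */
	}

	return 1;

err_free:
	free_rt_sched_group(tg);
	return 0;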
> +
> + init_rt_rq(&s_rq->rt);
> + init_dl_entity(dl_se);
> + dl_se->dl_runtime = tg->dl_bandwidth.dl_runtime;
> + dl_se->dl_period = tg->dl_bandwidth.dl_period;
> + dl_se->dl_deadline = dl_se->dl_period;
> + dl_se->dl_bw = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
> + dl_se->dl_density = to_ratio(dl_se->dl_period, dl_se->dl_runtime);
> + dl_se->dl_server = 1;
> +
> + dl_server_init(dl_se, &cpu_rq(i)->dl, s_rq, rt_server_has_tasks, rt_server_pick);
> +
> + init_tg_rt_entry(tg, s_rq, dl_se, i, parent->dl_se[i]);
> + }
> +
> return 1;
> }
Thanks,
Juri