[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <aFvx-z7Dd_0mYMeg@jlelli-thinkpadt14gen4.remote.csb>
Date: Wed, 25 Jun 2025 14:56:27 +0200
From: Juri Lelli <juri.lelli@...hat.com>
To: Joel Fernandes <joelagnelf@...dia.com>
Cc: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
Peter Zijlstra <peterz@...radead.org>,
Vincent Guittot <vincent.guittot@...aro.org>,
Dietmar Eggemann <dietmar.eggemann@....com>,
Steven Rostedt <rostedt@...dmis.org>,
Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
Valentin Schneider <vschneid@...hat.com>, Tejun Heo <tj@...nel.org>,
David Vernet <void@...ifault.com>, Andrea Righi <arighi@...dia.com>,
Changwoo Min <changwoo@...lia.com>
Subject: Re: [PATCH v5 04/14] sched/deadline: Prevent setting server as
started if params couldn't be applied
Hi Joel,
On 20/06/25 16:32, Joel Fernandes wrote:
> The following call trace fails to set dl_server_apply_params() as
> dl_bw_cpus() is 0 during CPU onlining.
>
> [ 11.878356] ------------[ cut here ]------------
> [ 11.882592] <TASK>
> [ 11.882685] enqueue_task_scx+0x190/0x280
> [ 11.882802] ttwu_do_activate+0xaa/0x2a0
> [ 11.882925] try_to_wake_up+0x371/0x600
> [ 11.883047] cpuhp_bringup_ap+0xd6/0x170
> [ 11.883172] cpuhp_invoke_callback+0x142/0x540
> [ 11.883327] _cpu_up+0x15b/0x270
> [ 11.883450] cpu_up+0x52/0xb0
> [ 11.883576] cpu_subsys_online+0x32/0x120
> [ 11.883704] online_store+0x98/0x130
> [ 11.883824] kernfs_fop_write_iter+0xeb/0x170
> [ 11.883972] vfs_write+0x2c7/0x430
> [ 11.884091] ksys_write+0x70/0xe0
> [ 11.884209] do_syscall_64+0xd6/0x250
> [ 11.884327] ? clear_bhb_loop+0x40/0x90
> [ 11.884443] entry_SYSCALL_64_after_hwframe+0x77/0x7f
>
> It is too early to start the server. Simply defer the starting of the
> server to the next enqueue if dl_server_apply_params() returns an error.
> In any case, we should not pretend like the server started and it does
> seem to mess up with the sched_ext CPU hotplug test.
>
> With this, the sched_ext hotplug test reliably passes.
>
> Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>
> ---
> kernel/sched/deadline.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> index ae15ec6294cf..d2eb31b45ba9 100644
> --- a/kernel/sched/deadline.c
> +++ b/kernel/sched/deadline.c
> @@ -1656,8 +1656,8 @@ void dl_server_start(struct sched_dl_entity *dl_se)
> u64 runtime = 50 * NSEC_PER_MSEC;
> u64 period = 1000 * NSEC_PER_MSEC;
>
> - dl_server_apply_params(dl_se, runtime, period, 1);
> -
> + if (dl_server_apply_params(dl_se, runtime, period, 1))
> + return;
> dl_se->dl_server = 1;
> dl_se->dl_defer = 1;
> setup_new_dl_entity(dl_se);
> @@ -1674,7 +1674,7 @@ void dl_server_start(struct sched_dl_entity *dl_se)
I wonder if the above will still be needed if we move dl-servers init
further down (later) the boot process as I am trying to do to fix issues
reported at [1]. Also, failing to apply parameters at boot is not very
nice I am thinking. Wondering if we should WARN or do something about
it.
> void dl_server_stop(struct sched_dl_entity *dl_se)
> {
> - if (!dl_se->dl_runtime)
> + if (!dl_se->dl_runtime || !dl_se->dl_server_active)
> return;
>
> dequeue_dl_entity(dl_se, DEQUEUE_SLEEP);
Yeah, I ended up adding the same check as a fixup in my WIP fix series.
:-)
https://github.com/jlelli/linux/commits/upstream/fix-grub/
1 - https://lore.kernel.org/lkml/aFvLYv0xSXxoyZZ8@jlelli-thinkpadt14gen4.remote.csb/
Powered by blists - more mailing lists