lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20250630144534.GA2439051@joelnvbox>
Date: Mon, 30 Jun 2025 10:45:34 -0400
From: Joel Fernandes <joelagnelf@...dia.com>
To: Juri Lelli <juri.lelli@...hat.com>
Cc: linux-kernel@...r.kernel.org, Ingo Molnar <mingo@...hat.com>,
	Peter Zijlstra <peterz@...radead.org>,
	Vincent Guittot <vincent.guittot@...aro.org>,
	Dietmar Eggemann <dietmar.eggemann@....com>,
	Steven Rostedt <rostedt@...dmis.org>,
	Ben Segall <bsegall@...gle.com>, Mel Gorman <mgorman@...e.de>,
	Valentin Schneider <vschneid@...hat.com>, Tejun Heo <tj@...nel.org>,
	David Vernet <void@...ifault.com>, Andrea Righi <arighi@...dia.com>,
	Changwoo Min <changwoo@...lia.com>
Subject: Re: [PATCH v5 04/14] sched/deadline: Prevent setting server as
 started if params couldn't be applied

On Wed, Jun 25, 2025 at 02:56:27PM +0200, Juri Lelli wrote:
> Hi Joel,
> 
> On 20/06/25 16:32, Joel Fernandes wrote:
> > The following call trace fails to set dl_server_apply_params() as
> > dl_bw_cpus() is 0 during CPU onlining.
> > 
> > [   11.878356] ------------[ cut here ]------------
> > [   11.882592]  <TASK>
> > [   11.882685]  enqueue_task_scx+0x190/0x280
> > [   11.882802]  ttwu_do_activate+0xaa/0x2a0
> > [   11.882925]  try_to_wake_up+0x371/0x600
> > [   11.883047]  cpuhp_bringup_ap+0xd6/0x170
> > [   11.883172]  cpuhp_invoke_callback+0x142/0x540
> > [   11.883327]  _cpu_up+0x15b/0x270
> > [   11.883450]  cpu_up+0x52/0xb0
> > [   11.883576]  cpu_subsys_online+0x32/0x120
> > [   11.883704]  online_store+0x98/0x130
> > [   11.883824]  kernfs_fop_write_iter+0xeb/0x170
> > [   11.883972]  vfs_write+0x2c7/0x430
> > [   11.884091]  ksys_write+0x70/0xe0
> > [   11.884209]  do_syscall_64+0xd6/0x250
> > [   11.884327]  ? clear_bhb_loop+0x40/0x90
> > [   11.884443]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
> > 
> > It is too early to start the server. Simply defer the starting of the
> > server to the next enqueue if dl_server_apply_params() returns an error.
> > In any case, we should not pretend like the server started and it does
> > seem to mess up with the sched_ext CPU hotplug test.
> > 
> > With this, the sched_ext hotplug test reliably passes.
> > 
> > Signed-off-by: Joel Fernandes <joelagnelf@...dia.com>
> > ---
> >  kernel/sched/deadline.c | 6 +++---
> >  1 file changed, 3 insertions(+), 3 deletions(-)
> > 
> > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
> > index ae15ec6294cf..d2eb31b45ba9 100644
> > --- a/kernel/sched/deadline.c
> > +++ b/kernel/sched/deadline.c
> > @@ -1656,8 +1656,8 @@ void dl_server_start(struct sched_dl_entity *dl_se)
> >  		u64 runtime =  50 * NSEC_PER_MSEC;
> >  		u64 period = 1000 * NSEC_PER_MSEC;
> >  
> > -		dl_server_apply_params(dl_se, runtime, period, 1);
> > -
> > +		if (dl_server_apply_params(dl_se, runtime, period, 1))
> > +			return;
> >  		dl_se->dl_server = 1;
> >  		dl_se->dl_defer = 1;
> >  		setup_new_dl_entity(dl_se);
> > @@ -1674,7 +1674,7 @@ void dl_server_start(struct sched_dl_entity *dl_se)
> 
> I wonder if the above will still be needed if we move dl-servers init
> further down (later) the boot process as I am trying to do to fix issues
> reported at [1]. Also, failing to apply parameters at boot is not very
> nice I am thinking. Wondering if we should WARN or do something about
> it.

I will add a warning as you suggest. I think it might be still be needed, if
for whatever reason, the apply_params fail and we end up warning? I think
your new patch should also include a warning.

I suggest, if its Ok with you, let us take my change unless your patch is
getting merged soon'ish. But if your merging is imminent, I'm happy to rebase
and test again, let me know, thanks!

> >  void dl_server_stop(struct sched_dl_entity *dl_se)
> >  {
> > -	if (!dl_se->dl_runtime)
> > +	if (!dl_se->dl_runtime || !dl_se->dl_server_active)
> >  		return;
> >  
> >  	dequeue_dl_entity(dl_se, DEQUEUE_SLEEP);
> 
> Yeah, I ended up adding the same check as a fixup in my WIP fix series.
> :-)
> 
> https://github.com/jlelli/linux/commits/upstream/fix-grub/
> 
> 1 - https://lore.kernel.org/lkml/aFvLYv0xSXxoyZZ8@jlelli-thinkpadt14gen4.remote.csb/

Ah, cool. :-) 

thanks,

 - Joel

> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ