lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <a04892df1ad94df2bbddd7378c39342a@honor.com>
Date: Wed, 11 Jun 2025 07:17:57 +0000
From: liuwenfang <liuwenfang@...or.com>
To: 'Andrea Righi' <arighi@...dia.com>
CC: 'Tejun Heo' <tj@...nel.org>, 'David Vernet' <void@...ifault.com>,
	'Changwoo Min' <changwoo@...lia.com>, 'Ingo Molnar' <mingo@...hat.com>,
	'Peter Zijlstra' <peterz@...radead.org>, 'Juri Lelli'
	<juri.lelli@...hat.com>, 'Vincent Guittot' <vincent.guittot@...aro.org>,
	'Dietmar Eggemann' <dietmar.eggemann@....com>, 'Steven Rostedt'
	<rostedt@...dmis.org>, 'Ben Segall' <bsegall@...gle.com>, 'Mel Gorman'
	<mgorman@...e.de>, 'Valentin Schneider' <vschneid@...hat.com>,
	"'linux-kernel@...r.kernel.org'" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] sched_ext: Fix NULL pointer dereferences in
 put_prev_task_scx

> 
> Hi liuwenfang,
> 
> On Tue, Jun 10, 2025 at 06:22:22AM +0000, liuwenfang wrote:
> > Thanks for your feedback.
> > This is triggered in kernel modules developed In the mobile scenario:
> > tasks on rq are migrated while the current cpu should be halted for power
> saving policy.
> > Its migration logic:
> > drain_rq_cpu_stop -- migrate_all_tasks :
> > for (;;) {
> > 	if (rq->nr_running == 1)
> > 		break;
> > 	for_each_class(class) {
> > 		next = class->pick_next_task(rq);
> > 		if (next) {
> > 			next->sched_class->put_prev_task(rq, next, NULL);
> 
> Can you do something like this instead?
> 
>   next->sched_class->put_prev_task(rq, next, next);
> 
> this should give you the same behavior you're relying on, without requiring the
> extra check in ext.c.

Yes, this can help me.

> 
> > 			break;
> > 		}
> > 	}
> > 	if (is_idle_task(next))
> > 		break;
> > 	dest_cpu = select_task_rq(next...);
> > 	move_queued_task(rq, rf, next, dest_cpu);
> > 	...
> > }
> >
> > put_prev_task in this function is selected to update util and
> > statistics info for each runnable tasks, here they are not dequeued yet.
> 
> I see, so it's an optimization/workaround to update utilization info without fully
> dequeuing the tasks. Is this out-of-tree code, I guess?
> 

Yes, this is out-of-tree code. 

> When you say the CPU is halted, we're not talking about CPU hotplugging, right?
> We're just migrating tasks off the CPU?
> 

Yes, this is one power-saving method to control tasks placement by bypassing
Certain CPUs in software, simply letting the halted CPU enter an idle state.
It replaces CPU hotplugging to reduce costs in mobile gaming scenario. 

> Also, if you're running sched_ext there are ways to exclude certain CPUs at the
> scheduler's level via ops.select_cpu() / ops.dispatch(). Do you think this could be
> a viable solution for your specific use case?
> 

Yes, Thanks very much, This is exactly the approach we have implemented.

Best regards

> Thanks,
> -Andrea
> 
> PS https://subspace.kernel.org/etiquette.html#do-not-top-post-when-replying
> 
> >
> > Best regards,
> >
> > > On Mon, Jun 09, 2025 at 11:36:15AM +0000, liuwenfang wrote:
> > > > As put_prev_task can be used in other kernel modules which can
> > > > lead to a NULL pointer. Fix this by checking for a valid next.
> > >
> > > Actually, put_prev_task() should be used only within kernel/sched/
> > > and, in theory, you should have done a dequeue_task() before
> > > put_prev_task() in this scenario, so SCX_TASK_QUEUED shouldn't be set in
> p->scx.flags.
> > >
> > > The change might still make sense, but can you clarify how you
> > > triggered the NULL pointer dereference?
> > >
> > > Thanks,
> > > -Andrea
> > >
> > > >
> > > > Signed-off-by: l00013971 <l00013971@...onor.com>
> > > > ---
> > > >  kernel/sched/ext.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c index
> > > > f5133249f..6a579babd 100644
> > > > --- a/kernel/sched/ext.c
> > > > +++ b/kernel/sched/ext.c
> > > > @@ -3262,7 +3262,7 @@ static void put_prev_task_scx(struct rq *rq,
> > > > struct
> > > task_struct *p,
> > > >  		 * ops.enqueue() that @p is the only one available for this cpu,
> > > >  		 * which should trigger an explicit follow-up scheduling event.
> > > >  		 */
> > > > -		if (sched_class_above(&ext_sched_class, next->sched_class)) {
> > > > +		if (next && sched_class_above(&ext_sched_class,
> > > > +next->sched_class)) {
> > > >
> 	WARN_ON_ONCE(!static_branch_unlikely(&scx_ops_enq_last));
> > > >  			do_enqueue_task(rq, p, SCX_ENQ_LAST, -1);
> > > >  		} else {
> > > > --
> > > > 2.17.1

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ