linux-kernel - Re: [RFC][PATCH] sched/ext: Avoid null ptr traversal when ->put_prev

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <DESWM9HEHSK3.34TGESJUP6IW8@google.com>
Date: Mon, 08 Dec 2025 14:27:47 +0000
From: Kuba Piecuch <jpiecuch@...gle.com>
To: Kuba Piecuch <jpiecuch@...gle.com>, John Stultz <jstultz@...gle.com>, 
	LKML <linux-kernel@...r.kernel.org>
Cc: Joel Fernandes <joelagnelf@...dia.com>, Qais Yousef <qyousef@...alina.io>, 
	Ingo Molnar <mingo@...hat.com>, Peter Zijlstra <peterz@...radead.org>, 
	Juri Lelli <juri.lelli@...hat.com>, Vincent Guittot <vincent.guittot@...aro.org>, 
	Dietmar Eggemann <dietmar.eggemann@....com>, Valentin Schneider <vschneid@...hat.com>, 
	Steven Rostedt <rostedt@...dmis.org>, Ben Segall <bsegall@...gle.com>, 
	Zimuzo Ezeozue <zezeozue@...gle.com>, Mel Gorman <mgorman@...e.de>, Will Deacon <will@...nel.org>, 
	Waiman Long <longman@...hat.com>, Boqun Feng <boqun.feng@...il.com>, 
	"Paul E. McKenney" <paulmck@...nel.org>, Metin Kaya <Metin.Kaya@....com>, 
	Xuewen Yan <xuewen.yan94@...il.com>, K Prateek Nayak <kprateek.nayak@....com>, 
	Thomas Gleixner <tglx@...utronix.de>, Daniel Lezcano <daniel.lezcano@...aro.org>, 
	Suleiman Souhlal <suleiman@...gle.com>, kuyo chang <kuyo.chang@...iatek.com>, 
	hupu <hupu.gm@...il.com>, Tejun Heo <tj@...nel.org>, David Vernet <void@...ifault.com>, 
	Andrea Righi <arighi@...dia.com>, Changwoo Min <changwoo@...lia.com>, <sched-ext@...ts.linux.dev>, 
	<kernel-team@...roid.com>
Subject: Re: [RFC][PATCH] sched/ext: Avoid null ptr traversal when
 ->put_prev_task() is called with NULL next

On Mon Dec 8, 2025 at 11:15 AM UTC, Kuba Piecuch wrote:
> On Mon Dec 8, 2025 at 10:10 AM UTC, Kuba Piecuch wrote:
>> It looks like it's impossible for an outside observer holding a CPU's rq lock
>> to observe a task that is running on that CPU and isn't queued, i.e.
>> 'running' implies 'queued' (I'm new to the scheduler so I may be wrong here).
>
> A task that blocks in __schedule() can drop the rq lock while picking the next
> task, which is after try_to_block_task() dequeues prev. So it's very much
> possible for a task on another CPU to grab the rq lock and observe prev as
> dequeued but still running.

Even with that, I'm not convinced that it's possible to do a NULL deref with
the current code.

In order for sched_change_begin() to do the NULL deref in put_prev_task_scx(),
we would need to have:

* rq->donor == p (for sched_change_begin() to call put_prev_task())
* p->on_rq != TASK_ON_RQ_QUEUED
  (for sched_change_begin() to not call dequeue_task() beforehand)
* p->scx.flags & SCX_TASK_QUEUED
  (for put_prev_task_scx() to enter the branch with the @next deref)

>From a brief survey of the code, __assuming proxy execution is disabled__,
I don't think it's possible for a remote task holding @rq's lock to observe
the second and third condition to be true.

Every time p->on_rq is changed away from TASK_ON_RQ_QUEUED, it happens under
the rq lock and is paired with a dequeue (see block_task(),
deactivate_task()). dequeue_task_scx() always clears SCX_TASK_QUEUED from
p->scx.flags.

Every time SCX_TASK_QUEUED is set in p->scx.flags (i.e. enqueue_task_scx()
is called), it happens under the rq lock and is either gated by
p->on_rq == TASK_ON_RQ_QUEUED (see ttwu_runnable(), sched_change_end()) or is
paired with p->on_rq being set to TASK_ON_RQ_QUEUED (see activate_task()).
It also happens in proxy_tag_curr(), which is a no-op if proxy execution is
disabled. Even when it's enabled, proxy_tag_curr() does a dequeue-enqueue
cycle while holding the rq lock, which doesn't look dangerous.

I'm not trying to say that we shouldn't add a NULL check, all this is just
for my own understanding.