Message-ID: <aYH_urViwKNiamo0@gpd4>
Date: Tue, 3 Feb 2026 15:01:30 +0100
From: Andrea Righi <arighi@...dia.com>
To: Tejun Heo <tj@...nel.org>
Cc: David Vernet <void@...ifault.com>, Changwoo Min <changwoo@...lia.com>,
Emil Tsalapatis <emil@...alapatis.com>, sched-ext@...ts.linux.dev,
linux-kernel@...r.kernel.org
Subject: Re: [PATCH] sched_ext: Fix NULL pointer deref and warnings during
scx teardown
On Mon, Feb 02, 2026 at 11:50:05PM +0100, Andrea Righi wrote:
> On Mon, Feb 02, 2026 at 10:52:04AM -1000, Tejun Heo wrote:
> > On Mon, Feb 02, 2026 at 07:54:50PM +0100, Andrea Righi wrote:
> > > I'm able to reproduce the NULL pointer dereference in set_cpus_allowed_scx()
> > > quite easily running `stress-ng --race-sched 0` with an scx scheduler that
> > > is intentionally starving tasks, triggering a stall => disable.
> > >
> > > I think this is what's happening:
> > >
> > >   CPU0                                    CPU1
> > >   ----                                    ----
> > >   __sched_setscheduler()
> > >     task_rq_lock(p)
> > >
> > >     next_class = __setscheduler_class()
> > >     // next_class is ext_sched_class
> > >                                           scx_disable_workfn()
> > >                                             scx_set_enable_state(SCX_DISABLING)
> > >
> > >                                             scx_task_iter_start()
> > >                                             while ((p = next())) {
> > >                                               ...
> > >                                               p->sched_class = fair_sched_class
> > >                                               ...
> > >                                             }
> > >                                             scx_task_iter_stop()
> > >
> > >                                             synchronize_rcu()
> > >                                             RCU_INIT_POINTER(scx_root, NULL)
> > >
> > >     scoped_guard(sched_change, ...) {
> > >       p->sched_class = next_class;
> > >       // next_class is still ext_sched_class,
> > >       // overwriting fair_sched_class!
> > >     }
> > >     // Guard ends, calls sched_change_end()
> > >     // switching_to_scx() called
> > >     // scx_root == NULL => returns early
> > >
> > >     task_rq_unlock(p)
> > >
> > >   sched_setaffinity(p)
> > >     set_cpus_allowed_scx()
> > >       sch = scx_root; // scx_root == NULL => BUG!
> >
> > Does the following patch fix the issue?
>
> Nope, I can still trigger this (with the same modified scx_simple +
> stress-ng --race-sched 0):
A quick reproducer:
https://github.com/sched-ext/scx/tree/scx-bug
$ make
$ vng -vr -- "stress-ng --race-sched 0 & ./build/scheds/c/scx_bug"
...
[ 3.375119] BUG: kernel NULL pointer dereference, address: 00000000000001c0
[ 3.375836] RIP: 0010:set_cpus_allowed_scx+0x1a/0xa0
It happens almost immediately.
-Andrea