linux-kernel - Re: [PATCH 02/12] sched_ext: Avoid NULL scx_root deref through SCX_HAS

Message-ID: <b9814fec-a9b6-4cd5-a0b1-1c2ddb214a03@linux.dev>
Date: Thu, 24 Apr 2025 15:23:40 +0800
From: Chengming Zhou <chengming.zhou@...ux.dev>
To: Tejun Heo <tj@...nel.org>, David Vernet <void@...ifault.com>,
 Andrea Righi <arighi@...dia.com>, Changwoo Min <changwoo@...lia.com>,
 linux-kernel@...r.kernel.org
Subject: Re: [PATCH 02/12] sched_ext: Avoid NULL scx_root deref through
 SCX_HAS_OP()

On 2025/4/24 07:44, Tejun Heo wrote:
> SCX_HAS_OP() tests scx_root->has_op bitmap. The bitmap is currently in a
> statically allocated struct scx_sched and initialized while loading the BPF
> scheduler and cleared while unloading, and thus can be tested anytime.
> However, scx_root will be switched to dynamic allocation and thus won't
> always be dereferenceable.
> 
> Most usages of SCX_HAS_OP() are already protected by scx_enabled() either
> directly or indirectly (e.g. through a task which is on SCX). However, there
> are a couple places that could try to dereference NULL scx_root. Update them
> so that scx_root is guaranteed to be valid before SCX_HAS_OP() is called.
> 
> - In handle_hotplug(), test whether scx_root is NULL before doing anything
>    else. This is safe because scx_root updates will be protected by
>    cpus_read_lock().
> 
> - In scx_tg_offline(), test scx_cgroup_enabled before invoking SCX_HAS_OP(),
>    which should guarantee that scx_root won't turn NULL. This is also in line
>    with other cgroup operations. As the code path is synchronized against
>    scx_cgroup_init/exit() through scx_cgroup_rwsem, this shouldn't cause any
>    behavior differences.
> 
> Signed-off-by: Tejun Heo <tj@...nel.org>
> ---
>   kernel/sched/ext.c | 11 ++++++++++-
>   1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sched/ext.c b/kernel/sched/ext.c
> index 975f6963a01b..ad392890d2dd 100644
> --- a/kernel/sched/ext.c
> +++ b/kernel/sched/ext.c
> @@ -3498,6 +3498,14 @@ static void handle_hotplug(struct rq *rq, bool online)
>   
>   	atomic_long_inc(&scx_hotplug_seq);
>   
> +	/*
> +	 * scx_root updates are protected by cpus_read_lock() and will stay
> +	 * stable here. Note that we can't depend on scx_enabled() test as the
> +	 * hotplug ops need to be enabled before __scx_enabled is set.
> +	 */
> +	if (!scx_root)
> +		return;
> +
>   	if (scx_enabled())
>   		scx_idle_update_selcpu_topology(&scx_root->ops);

Just curious: does the comment added above mean we shouldn't
check scx_enabled() here anymore?

Thanks!

>   
> @@ -3994,7 +4002,8 @@ void scx_tg_offline(struct task_group *tg)
>   
>   	percpu_down_read(&scx_cgroup_rwsem);
>   
> -	if (SCX_HAS_OP(scx_root, cgroup_exit) && (tg->scx_flags & SCX_TG_INITED))
> +	if (scx_cgroup_enabled && SCX_HAS_OP(scx_root, cgroup_exit) &&
> +	    (tg->scx_flags & SCX_TG_INITED))
>   		SCX_CALL_OP(SCX_KF_UNLOCKED, cgroup_exit, NULL, tg->css.cgroup);
>   	tg->scx_flags &= ~(SCX_TG_ONLINE | SCX_TG_INITED);
>
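
For reference, the two guards in the quoted hunks can be modeled in user space. This is only a sketch of one reading of the patch, not kernel code: struct sched_model, root_model, enabled_model, cgroup_enabled_model and the counters are hypothetical stand-ins for scx_root, scx_enabled(), scx_cgroup_enabled and the real callbacks, and the assumption that hotplug callbacks run whenever the root is non-NULL is taken from the added comment ("the hotplug ops need to be enabled before __scx_enabled is set").

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct scx_sched; not the real definition. */
struct sched_model {
	bool has_cgroup_exit;	/* models one bit of the has_op bitmap */
};

static struct sched_model *root_model;	/* models scx_root; NULL when nothing is loaded */
static bool enabled_model;		/* models scx_enabled() */
static bool cgroup_enabled_model;	/* models scx_cgroup_enabled */

static int hotplug_seq;			/* models scx_hotplug_seq */
static int topology_updates;		/* counts modeled topology updates */
static int hotplug_callbacks;		/* counts modeled hotplug op invocations */

/* Models the patched handle_hotplug(): the NULL test precedes any root use. */
static void handle_hotplug_model(void)
{
	hotplug_seq++;

	if (!root_model)
		return;			/* no scheduler loaded: nothing to dereference */

	if (enabled_model)
		topology_updates++;	/* only once the scheduler is fully enabled */

	hotplug_callbacks++;		/* assumed to run as soon as the root exists */
}

/*
 * Models the patched condition in scx_tg_offline(): && short-circuits, so
 * root_model is never dereferenced while cgroup support is disabled.
 */
static bool tg_offline_calls_exit_model(bool tg_inited)
{
	return cgroup_enabled_model && root_model->has_cgroup_exit && tg_inited;
}
```

Under these assumptions the model has three states: no root (early return), root set but not yet enabled (callbacks only), and fully enabled (topology update plus callbacks). Read that way, the NULL test and the scx_enabled() test cover different states, which is one possible answer to the question above; the patch author's reply should be taken as authoritative.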