Open Source and information security mailing list archives
Message-ID: <bylpnof6h3mmeaovba573fer4ikrr5zhr53htbso6zzuw5czzj@tihl7ajm4eaj>
Date: Mon, 12 May 2025 11:58:29 -0500
From: David Vernet <void@...ifault.com>
To: Andrea Righi <arighi@...dia.com>
Cc: Tejun Heo <tj@...nel.org>, Changwoo Min <changwoo@...lia.com>, 
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] sched_ext/idle: Make scx_bpf_select_cpu_and() usable
 from any context

On Mon, May 12, 2025 at 05:14:56PM +0200, Andrea Righi wrote:
> Allow scx_bpf_select_cpu_and() to be used from any context, not just
> from ops.enqueue() or ops.select_cpu().
> 
> This enables schedulers, including user-space ones, to implement a
> consistent idle CPU selection policy and helps reduce code duplication.
> 
> Signed-off-by: Andrea Righi <arighi@...dia.com>

Hi Andrea,

> ---
>  kernel/sched/ext_idle.c | 23 +++++++++++++++++------
>  1 file changed, 17 insertions(+), 6 deletions(-)
> 
> diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
> index 6915685cd3d6b..5373132db02b6 100644
> --- a/kernel/sched/ext_idle.c
> +++ b/kernel/sched/ext_idle.c
> @@ -943,10 +943,15 @@ __bpf_kfunc s32 scx_bpf_select_cpu_and(struct task_struct *p, s32 prev_cpu, u64

FYI, it looks like there are a few comments sprinkled around here that still say
only the ops.select_cpu() and ops.enqueue() ops are allowed; they'll need
updating now that the restriction is lifted.

>  	if (!check_builtin_idle_enabled())
>  		return -EBUSY;
>  
> -	if (!scx_kf_allowed(SCX_KF_SELECT_CPU | SCX_KF_ENQUEUE))
> -		return -EPERM;
> -
>  #ifdef CONFIG_SMP
> +	/*
> +	 * If called from an unlocked context, try to acquire
> +	 * cpus_read_lock() to avoid races with CPU hotplug.
> +	 */
> +	if (scx_kf_allowed_if_unlocked())
> +		if (!cpus_read_trylock())
> +			return -EBUSY;

Hmm, so this now assumes that this function can be called without the rq lock
held. I'm not sure that's safe: scx_select_cpu_dfl() queries p->cpus_ptr, which
is protected by the rq lock, and scx_bpf_select_cpu_and() checks
p->nr_cpus_allowed, which is protected sometimes by pi_lock and sometimes by
the rq lock, etc., depending on the task's state.

This might be fine in practice, as I _think_ the only unsafe thing would be if
p->cpus_ptr could see a torn read or write. scx_select_cpu_dfl() does do
preempt_disable(), so there are no issues querying the per-CPU variables there.

As long as this is safe (or can be made safe), I don't have a strong opinion
one way or the other. I think it's probably a good idea to let users remove
code duplication, and in general it's up to the user to use this API correctly
(e.g. if you're preempted during a call to scx_bpf_select_cpu_and() and
prev_cpu changes out from under you because the task was migrated, that's
likely just a bug in the scheduler).

Thanks,
David

>  	/*
>  	 * This may also be called from ops.enqueue(), so we need to handle
>  	 * per-CPU tasks as well. For these tasks, we can skip all idle CPU
> @@ -956,10 +961,16 @@ __bpf_kfunc s32 scx_bpf_select_cpu_and(struct task_struct *p, s32 prev_cpu, u64
>  	if (p->nr_cpus_allowed == 1) {
>  		if (cpumask_test_cpu(prev_cpu, cpus_allowed) &&
>  		    scx_idle_test_and_clear_cpu(prev_cpu))
> -			return prev_cpu;
> -		return -EBUSY;
> +			cpu = prev_cpu;
> +		else
> +			cpu = -EBUSY;
> +		goto out_unlock;
>  	}
>  	cpu = scx_select_cpu_dfl(p, prev_cpu, wake_flags, cpus_allowed, flags);
> +
> +out_unlock:
> +	if (scx_kf_allowed_if_unlocked())
> +		cpus_read_unlock();
>  #else
>  	cpu = -EBUSY;
>  #endif
> @@ -1266,6 +1277,7 @@ BTF_ID_FLAGS(func, scx_bpf_pick_idle_cpu_node, KF_RCU)
>  BTF_ID_FLAGS(func, scx_bpf_pick_idle_cpu, KF_RCU)
>  BTF_ID_FLAGS(func, scx_bpf_pick_any_cpu_node, KF_RCU)
>  BTF_ID_FLAGS(func, scx_bpf_pick_any_cpu, KF_RCU)
> +BTF_ID_FLAGS(func, scx_bpf_select_cpu_and, KF_RCU)
>  BTF_KFUNCS_END(scx_kfunc_ids_idle)
>  
>  static const struct btf_kfunc_id_set scx_kfunc_set_idle = {
> @@ -1275,7 +1287,6 @@ static const struct btf_kfunc_id_set scx_kfunc_set_idle = {
>  
>  BTF_KFUNCS_START(scx_kfunc_ids_select_cpu)
>  BTF_ID_FLAGS(func, scx_bpf_select_cpu_dfl, KF_RCU)
> -BTF_ID_FLAGS(func, scx_bpf_select_cpu_and, KF_RCU)
>  BTF_KFUNCS_END(scx_kfunc_ids_select_cpu)
>  
>  static const struct btf_kfunc_id_set scx_kfunc_set_select_cpu = {
> -- 
> 2.49.0
> 

