Message-ID: <aCkStT1QTGWZ3B50@gpd3>
Date: Sun, 18 May 2025 00:50:29 +0200
From: Andrea Righi <arighi@...dia.com>
To: David Vernet <void@...ifault.com>
Cc: Tejun Heo <tj@...nel.org>, Changwoo Min <changwoo@...lia.com>,
	linux-kernel@...r.kernel.org
Subject: Re: [PATCH 3/4] sched_ext: idle: Allow scx_bpf_select_cpu_and() from
 unlocked context

Hi David,

On Sat, May 17, 2025 at 12:54:29PM -0500, David Vernet wrote:
> On Thu, May 15, 2025 at 09:11:44PM +0200, Andrea Righi wrote:
> > Allow scx_bpf_select_cpu_and() to be used from an unlocked context, in
> > addition to ops.enqueue() or ops.select_cpu().
> > 
> > This enables schedulers, including user-space ones, to implement a
> > consistent idle CPU selection policy and helps reduce code duplication.
> > 
> > Signed-off-by: Andrea Righi <arighi@...dia.com>
> 
> Hey Andrea,
> 
> Nice, this looks correct and reasonable to me. Just left one suggestion below
> that I'd be curious to hear your thoughts on.

Thanks for looking at this!

> 
> > ---
> >  kernel/sched/ext_idle.c | 37 +++++++++++++++++++++++++++----------
> >  1 file changed, 27 insertions(+), 10 deletions(-)
> > 
> > diff --git a/kernel/sched/ext_idle.c b/kernel/sched/ext_idle.c
> > index 716863f1f8cee..37279a09900ca 100644
> > --- a/kernel/sched/ext_idle.c
> > +++ b/kernel/sched/ext_idle.c
> > @@ -922,9 +922,10 @@ __bpf_kfunc s32 scx_bpf_select_cpu_dfl(struct task_struct *p, s32 prev_cpu,
> >   * @cpus_allowed: cpumask of allowed CPUs
> >   * @flags: %SCX_PICK_IDLE* flags
> >   *
> > - * Can only be called from ops.select_cpu() or ops.enqueue() if the
> > - * built-in CPU selection is enabled: ops.update_idle() is missing or
> > - * %SCX_OPS_KEEP_BUILTIN_IDLE is set.
> > + * Can be called from ops.select_cpu(), ops.enqueue(), or from an unlocked
> > + * context such as a BPF test_run() call, as long as built-in CPU selection
> > + * is enabled: ops.update_idle() is missing or %SCX_OPS_KEEP_BUILTIN_IDLE
> > + * is set.
> >   *
> >   * @p, @prev_cpu and @wake_flags match ops.select_cpu().
> >   *
> > @@ -936,6 +937,7 @@ __bpf_kfunc s32 scx_bpf_select_cpu_and(struct task_struct *p, s32 prev_cpu, u64
> >  				       const struct cpumask *cpus_allowed, u64 flags)
> >  {
> >  	struct rq *rq;
> > +	struct rq_flags rf;
> >  	s32 cpu;
> >  
> >  	if (!kf_cpu_valid(prev_cpu, NULL))
> > @@ -944,15 +946,26 @@ __bpf_kfunc s32 scx_bpf_select_cpu_and(struct task_struct *p, s32 prev_cpu, u64
> >  	if (!check_builtin_idle_enabled())
> >  		return -EBUSY;
> >  
> > -	if (!scx_kf_allowed(SCX_KF_SELECT_CPU | SCX_KF_ENQUEUE))
> > -		return -EPERM;
> > +	/*
> > +	 * If called from an unlocked context, acquire the task's rq lock,
> > +	 * so that we can safely access p->cpus_ptr and p->nr_cpus_allowed.
> > +	 *
> > +	 * Otherwise, allow using this kfunc only from ops.select_cpu()
> > +	 * and ops.enqueue().
> > +	 */
> > +	if (scx_kf_allowed_if_unlocked()) {
> > +		rq = task_rq_lock(p, &rf);
> > +	} else {
> > +		if (!scx_kf_allowed(SCX_KF_SELECT_CPU | SCX_KF_ENQUEUE))
> > +			return -EPERM;
> > +		rq = scx_locked_rq();
> > +	}
> >  
> >  	/*
> >  	 * Validate locking correctness to access p->cpus_ptr and
> >  	 * p->nr_cpus_allowed: if we're holding an rq lock, we're safe;
> >  	 * otherwise, assert that p->pi_lock is held.
> >  	 */
> > -	rq = scx_locked_rq();
> >  	if (!rq)
> >  		lockdep_assert_held(&p->pi_lock);
> >  
> > @@ -966,13 +979,17 @@ __bpf_kfunc s32 scx_bpf_select_cpu_and(struct task_struct *p, s32 prev_cpu, u64
> >  	if (p->nr_cpus_allowed == 1) {
> >  		if (cpumask_test_cpu(prev_cpu, cpus_allowed) &&
> >  		    scx_idle_test_and_clear_cpu(prev_cpu))
> > -			return prev_cpu;
> > -		return -EBUSY;
> > +			cpu = prev_cpu;
> > +		else
> > +			cpu = -EBUSY;
> > +	} else {
> > +		cpu = scx_select_cpu_dfl(p, prev_cpu, wake_flags, cpus_allowed, flags);
> 
> I wonder if we should just bring this into scx_select_cpu_dfl()? It seems like
> it would make sense to do this optimization whether we're looking at
> cpus_allowed here, or p->cpus_ptr in scx_select_cpu_dfl(). I seem to recall us
> having this in there before so there may be a reason we removed it, but I've
> been out of the game for a while so I'm not sure.

Trying to remember... it was probably removed because ops.select_cpu() is
never called for tasks that can only run on one CPU.
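
For context, the point of the unlocked path is to let a scheduler reuse the
same built-in idle-selection policy outside ops.select_cpu(), e.g. directly
from ops.enqueue(). A rough sketch (not from the patch; the struct_ops name
is made up, and it assumes the current kfunc signatures):

```c
/*
 * Illustrative sketch only: pick an idle CPU and dispatch directly
 * from ops.enqueue(), using the same built-in idle selection as
 * ops.select_cpu(). "sketch_enqueue" is a made-up example op.
 */
void BPF_STRUCT_OPS(sketch_enqueue, struct task_struct *p, u64 enq_flags)
{
	s32 cpu;

	/* Idle CPU selection restricted to the task's allowed CPUs */
	cpu = scx_bpf_select_cpu_and(p, scx_bpf_task_cpu(p), 0,
				     p->cpus_ptr, 0);
	if (cpu >= 0) {
		/* Found an idle CPU: dispatch the task directly to it */
		scx_bpf_dsq_insert(p, SCX_DSQ_LOCAL_ON | cpu,
				   SCX_SLICE_DFL, enq_flags);
		return;
	}

	/* Otherwise fall back to the global DSQ */
	scx_bpf_dsq_insert(p, SCX_DSQ_GLOBAL, SCX_SLICE_DFL, enq_flags);
}
```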

> 
> Anyways, if we could do this, then we could bring both scx_bpf_select_cpu_and()
> and scx_select_cpu_dfl() into the scx_kfunc_ids_idle kfunc group and remove
> scx_kfunc_ids_select_cpu.
> 
> What do you think?

Are you suggesting that scx_bpf_select_cpu_dfl() should also be allowed in
the same contexts as scx_bpf_select_cpu_and()?

I did consider that, but was initially concerned about the overhead of
handling the different calling contexts. In practice, though, those extra
checks would likely be negligible, and it would also make it possible to
use scx_bpf_select_cpu_dfl() from ops.enqueue() as well, so overall I see
more pros than cons.
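
For reference, this is the usual scx_bpf_select_cpu_dfl() pattern from
ops.select_cpu() today; if it became callable from ops.enqueue() too, both
callbacks could share one idle-selection path. A sketch assuming the
existing kfunc signature (the struct_ops name is made up):

```c
/*
 * Illustrative sketch only: the existing scx_bpf_select_cpu_dfl()
 * usage from ops.select_cpu(). "sketch_select_cpu" is a made-up
 * example op.
 */
s32 BPF_STRUCT_OPS(sketch_select_cpu, struct task_struct *p,
		   s32 prev_cpu, u64 wake_flags)
{
	bool is_idle = false;
	s32 cpu;

	cpu = scx_bpf_select_cpu_dfl(p, prev_cpu, wake_flags, &is_idle);
	if (is_idle)
		/* Idle CPU found: dispatch directly, skipping ops.enqueue() */
		scx_bpf_dsq_insert(p, SCX_DSQ_LOCAL, SCX_SLICE_DFL, 0);

	return cpu;
}
```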

Thanks,
-Andrea
