Message-ID: <Zxvz3mlxRm-BxuJW@slm.duckdns.org>
Date: Fri, 25 Oct 2024 09:39:10 -1000
From: Tejun Heo <tj@...nel.org>
To: David Vernet <void@...ifault.com>
Cc: sched-ext@...a.com, kernel-team@...a.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] scx: Fix raciness in scx_ops_bypass()
On Fri, Oct 25, 2024 at 12:40:14AM -0500, David Vernet wrote:
> scx_ops_bypass() can currently race on the ops enable / disable path as
> follows:
>
> 1. scx_ops_bypass(true) called on enable path, bypass depth is set to 1
> 2. An op on the init path exits, which schedules scx_ops_disable_workfn()
> 3. scx_ops_bypass(false) is called on the disable path, and bypass depth
> is decremented to 0
> 4. kthread is scheduled to execute scx_ops_disable_workfn()
> 5. scx_ops_bypass(true) called, bypass depth set to 1
> 6. scx_ops_bypass() races when iterating over CPUs
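
To make step 6 concrete, the enable/disable path currently has roughly the
following shape -- a paraphrased sketch, not the exact code, with
toggle_bypass_state() standing in for the existing per-rq flag flipping:

	static atomic_t scx_ops_bypass_depth;

	static void scx_ops_bypass(bool bypass)
	{
		int depth, cpu;

		if (bypass) {
			depth = atomic_inc_return(&scx_ops_bypass_depth);
			if (depth != 1)
				return;	/* only the 0->1 transition iterates */
		} else {
			depth = atomic_dec_return(&scx_ops_bypass_depth);
			if (depth != 0)
				return;	/* only the 1->0 transition iterates */
		}

		/*
		 * After steps 1-5 above, an enabling and a disabling call
		 * can both get past the depth check and flip the per-CPU
		 * state under each other.
		 */
		for_each_possible_cpu(cpu)
			toggle_bypass_state(cpu_rq(cpu), bypass); /* stand-in */
	}

Once the depth count goes out of sync with the per-CPU state, nothing
stops the enable and disable iterations from interleaving.
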
>
> Fixing this is difficult because we can't take any locks when enabling
> bypass, since we can't trust the BPF scheduler. This is
We can't use mutexes but can definitely use raw_spinlocks.
> problematic, because what we really need to do is coordinate between
> possible concurrent calls of scx_ops_bypass(true) and
> scx_ops_bypass(false), but the whole point of that code is that we can't
> use any locks to coordinate. Instead of taking a lock, however, we can
> just serialize the calls to enable and disable bypass by executing
> the calls on the scx_ops_helper kthread that's currently responsible for
> disabling a BPF scheduler.
>
> This patch therefore adds a new schedule_scx_bypass_delta() function which
> schedules changes to scx_ops_bypass() to occur on the scx_ops_helper
> kthread (where necessary).
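
IIUC, the idea is something like the following -- a hypothetical sketch
just to restate the approach, the actual patch may well differ:

	/* hypothetical; assumes scx_ops_helper is a kthread_worker */
	static atomic_t scx_bypass_delta;

	static void scx_bypass_workfn(struct kthread_work *work)
	{
		int delta = atomic_xchg(&scx_bypass_delta, 0);

		/* all depth changes now happen on one kthread, in order */
		while (delta > 0) { scx_ops_bypass(true);  delta--; }
		while (delta < 0) { scx_ops_bypass(false); delta++; }
	}
	static DEFINE_KTHREAD_WORK(scx_bypass_work, scx_bypass_workfn);

	static void schedule_scx_bypass_delta(int delta)
	{
		atomic_add(delta, &scx_bypass_delta);
		kthread_queue_work(scx_ops_helper, &scx_bypass_work);
	}
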
Can't we just add a static raw_spinlock to protect the scx_ops_bypass()
body, and maybe turn scx_ops_bypass_depth into a regular int while at it?
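
Something like the following -- completely untested, with
toggle_bypass_state() again standing in for the per-rq flag flipping:

	static void scx_ops_bypass(bool bypass)
	{
		static DEFINE_RAW_SPINLOCK(bypass_lock);
		unsigned long flags;
		int cpu;

		raw_spin_lock_irqsave(&bypass_lock, flags);
		if (bypass) {
			scx_ops_bypass_depth++;
			WARN_ON_ONCE(scx_ops_bypass_depth <= 0);
			if (scx_ops_bypass_depth != 1)
				goto unlock;
		} else {
			scx_ops_bypass_depth--;
			WARN_ON_ONCE(scx_ops_bypass_depth < 0);
			if (scx_ops_bypass_depth != 0)
				goto unlock;
		}

		/*
		 * The depth transition and the CPU iteration are now one
		 * critical section, so concurrent enable/disable calls
		 * can no longer interleave.
		 */
		for_each_possible_cpu(cpu)
			toggle_bypass_state(cpu_rq(cpu), bypass); /* stand-in */
	unlock:
		raw_spin_unlock_irqrestore(&bypass_lock, flags);
	}

Per the above, a mutex wouldn't be safe here, but a raw_spinlock around a
short, trusted critical section should be.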
Thanks.
--
tejun