[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20241025194728.GA2648@maniforge>
Date: Fri, 25 Oct 2024 14:47:28 -0500
From: David Vernet <void@...ifault.com>
To: Tejun Heo <tj@...nel.org>
Cc: sched-ext@...a.com, kernel-team@...a.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] scx: Fix raciness in scx_ops_bypass()
On Fri, Oct 25, 2024 at 09:39:10AM -1000, Tejun Heo wrote:
> On Fri, Oct 25, 2024 at 12:40:14AM -0500, David Vernet wrote:
> > scx_ops_bypass() can currently race on the ops enable / disable path as
> > follows:
> >
> > 1. scx_ops_bypass(true) called on enable path, bypass depth is set to 1
> > 2. An op on the init path exits, which schedules scx_ops_disable_workfn()
> > 3. scx_ops_bypass(false) is called on the disable path, and bypass depth
> > is decremented to 0
> > 4. kthread is scheduled to execute scx_ops_disable_workfn()
> > 5. scx_ops_bypass(true) called, bypass depth set to 1
> > 6. scx_ops_bypass() races when iterating over CPUs
> >
> > Fixing this is difficult because we can't take any locks when enabling
> > bypass due to us not being able to trust the BPF scheduler. This is
>
> We can't use mutexes but can definitely use raw_spinlocks.
Right, fair enough, just need to use a lock where the holder can't be
preempted.
>
> > problematic, because what we really need to do is coordinate between
> > possible concurrent calls of scx_ops_bypass(true) and
> > scx_ops_bypass(false), but the whole point of that code is that we can't
> > use any locks to coordinate. Instead of taking a lock, however, we can
> > instead just serialize the calls to enable and disable bypass by executing
> > the calls on the scx_ops_helper kthread that's currently responsible for
> > disabling a BPF scheduler.
> >
> > This patch therefore adds a new schedule_scx_bypass_delta() function which
> > schedules changes to scx_ops_bypass() to occur on the scx_ops_helper
> > kthread (where necessary).
>
> Can't we just add a static raw_spinlock to protect scx_ops_bypass() body and
> maybe turn scx_ops_bypass_depth into a regular int while at it?
Yeah that's of course a lot simpler, will send a v2.
Download attachment "signature.asc" of type "application/pgp-signature" (229 bytes)
Powered by blists - more mailing lists