Message-ID: <Zxvz3mlxRm-BxuJW@slm.duckdns.org>
Date: Fri, 25 Oct 2024 09:39:10 -1000
From: Tejun Heo <tj@...nel.org>
To: David Vernet <void@...ifault.com>
Cc: sched-ext@...a.com, kernel-team@...a.com, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 2/2] scx: Fix raciness in scx_ops_bypass()
On Fri, Oct 25, 2024 at 12:40:14AM -0500, David Vernet wrote:
> scx_ops_bypass() can currently race on the ops enable / disable path as
> follows:
>
> 1. scx_ops_bypass(true) called on enable path, bypass depth is set to 1
> 2. An op on the init path exits, which schedules scx_ops_disable_workfn()
> 3. scx_ops_bypass(false) is called on the disable path, and bypass depth
> is decremented to 0
> 4. kthread is scheduled to execute scx_ops_disable_workfn()
> 5. scx_ops_bypass(true) called, bypass depth set to 1
> 6. scx_ops_bypass() races when iterating over CPUs
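
To make step 6 concrete, the enable/disable path currently has roughly the
following shape -- a paraphrased sketch, not the exact code, with
toggle_bypass_state() standing in for the existing per-rq flag flipping:

	static atomic_t scx_ops_bypass_depth;

	static void scx_ops_bypass(bool bypass)
	{
		int depth, cpu;

		if (bypass) {
			depth = atomic_inc_return(&scx_ops_bypass_depth);
			if (depth != 1)
				return;	/* only the 0->1 transition iterates */
		} else {
			depth = atomic_dec_return(&scx_ops_bypass_depth);
			if (depth != 0)
				return;	/* only the 1->0 transition iterates */
		}

		/*
		 * After steps 1-5 above, an enabling and a disabling call
		 * can both get past the depth check and flip the per-CPU
		 * state under each other.
		 */
		for_each_possible_cpu(cpu)
			toggle_bypass_state(cpu_rq(cpu), bypass); /* stand-in */
	}

Once the depth count goes out of sync with the per-CPU state, nothing
stops the enable and disable iterations from interleaving.
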
>
> Fixing this is difficult because we can't take any locks when enabling
> bypass, since we can't trust the BPF scheduler. This is
We can't use mutexes but can definitely use raw_spinlocks.
> problematic, because what we really need to do is coordinate between
> possible concurrent calls of scx_ops_bypass(true) and
> scx_ops_bypass(false), but the whole point of that code is that we can't
> use any locks to coordinate. Instead of taking a lock, however, we can
> just serialize the calls to enable and disable bypass by executing
> the calls on the scx_ops_helper kthread that's currently responsible for
> disabling a BPF scheduler.
>
> This patch therefore adds a new schedule_scx_bypass_delta() function which
> schedules changes to scx_ops_bypass() to occur on the scx_ops_helper
> kthread (where necessary).
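
IIUC, the idea is something like the following -- a hypothetical sketch
just to restate the approach, the actual patch may well differ:

	/* hypothetical; assumes scx_ops_helper is a kthread_worker */
	static atomic_t scx_bypass_delta;

	static void scx_bypass_workfn(struct kthread_work *work)
	{
		int delta = atomic_xchg(&scx_bypass_delta, 0);

		/* all depth changes now happen on one kthread, in order */
		while (delta > 0) { scx_ops_bypass(true);  delta--; }
		while (delta < 0) { scx_ops_bypass(false); delta++; }
	}
	static DEFINE_KTHREAD_WORK(scx_bypass_work, scx_bypass_workfn);

	static void schedule_scx_bypass_delta(int delta)
	{
		atomic_add(delta, &scx_bypass_delta);
		kthread_queue_work(scx_ops_helper, &scx_bypass_work);
	}
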
Can't we just add a static raw_spinlock to protect the scx_ops_bypass()
body, and maybe turn scx_ops_bypass_depth into a regular int while at it?
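
Something like the following -- completely untested, with
toggle_bypass_state() again standing in for the per-rq flag flipping:

	static void scx_ops_bypass(bool bypass)
	{
		static DEFINE_RAW_SPINLOCK(bypass_lock);
		unsigned long flags;
		int cpu;

		raw_spin_lock_irqsave(&bypass_lock, flags);
		if (bypass) {
			scx_ops_bypass_depth++;
			WARN_ON_ONCE(scx_ops_bypass_depth <= 0);
			if (scx_ops_bypass_depth != 1)
				goto unlock;
		} else {
			scx_ops_bypass_depth--;
			WARN_ON_ONCE(scx_ops_bypass_depth < 0);
			if (scx_ops_bypass_depth != 0)
				goto unlock;
		}

		/*
		 * The depth transition and the CPU iteration are now one
		 * critical section, so concurrent enable/disable calls
		 * can no longer interleave.
		 */
		for_each_possible_cpu(cpu)
			toggle_bypass_state(cpu_rq(cpu), bypass); /* stand-in */
	unlock:
		raw_spin_unlock_irqrestore(&bypass_lock, flags);
	}

Per the above, a mutex wouldn't be safe here, but a raw_spinlock around a
short, trusted critical section should be.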
Thanks.
--
tejun