Message-ID: <20251006190739.GZ3245006@noisy.programming.kicks-ass.net>
Date: Mon, 6 Oct 2025 21:07:39 +0200
From: Peter Zijlstra <peterz@...radead.org>
To: cuiguoqi <cuiguoqi@...inos.cn>
Cc: rostedt@...dmis.org, bigeasy@...utronix.de, bsegall@...gle.com,
clrkwllms@...nel.org, dietmar.eggemann@....com, guoqi0226@....com,
juri.lelli@...hat.com, linux-kernel@...r.kernel.org,
linux-rt-devel@...ts.linux.dev, mgorman@...e.de, mingo@...hat.com,
vincent.guittot@...aro.org, vschneid@...hat.com
Subject: Re: [PATCH] sched: Fix race in rt_mutex_pre_schedule by removing
non-atomic fetch_and_set

On Wed, Aug 27, 2025 at 04:17:50PM +0800, cuiguoqi wrote:
> The issue arises during EDEADLK testing in `lib/locking-selftest.c` when `is_wait_die=1`.
>
> In this mode, the global `debug_locks` flag is cleared via `__debug_locks_off()` (which does `xchg(&debug_locks, 0)`) on the blocking path of `rt_mutex_slowlock()`, specifically in `rt_mutex_slowlock_block()`:
>
> rt_mutex_slowlock()
>   rt_mutex_pre_schedule()
>   rt_mutex_slowlock_block()
>     DEBUG_LOCKS_WARN_ON(ww_ctx->contending_lock)
>       __debug_locks_off(); // xchg(&debug_locks, 0)
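>
> For reference, `__debug_locks_off()` is a one-line xchg() wrapper (a
> sketch matching the include/linux/debug_locks.h definition):
>
>     /* Atomically clear the global debug_locks flag and return its
>      * previous value, so only the first error path reports */
>     static inline int __debug_locks_off(void)
>     {
>             return xchg(&debug_locks, 0);
>     }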
>
> However, `rt_mutex_post_schedule()` still performs:
>
> lockdep_assert(fetch_and_set(current->sched_rt_mutex, 0));
>
> Which expands to:
>
> do {
>         WARN_ON(debug_locks && !( ({ int _x = current->sched_rt_mutex; current->sched_rt_mutex = 0; _x; }) ));
> } while (0)
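>
> For completeness, a sketch of the two macros involved (matching the
> definitions in include/linux/lockdep.h and kernel/sched/core.c, as far
> as I can tell):
>
>     /* include/linux/lockdep.h */
>     #define lockdep_assert(cond) \
>             do { WARN_ON(debug_locks && !(cond)); } while (0)
>
>     /* kernel/sched/core.c: plain read-modify-write of an int; no
>      * atomicity is guaranteed, hence "non-atomic fetch_and_set" */
>     #define fetch_and_set(x, v) ({ int _x = (x); (x) = (v); _x; })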
>
> The generated assembly shows that the entire assertion is conditional on `debug_locks`:
>
>         adrp    x0, debug_locks
>         ldr     w0, [x0]
>         cbz     w0, .label_skip_warn    // Skip WARN if debug_locks == 0
>
> This means that if `debug_locks` was cleared earlier, the check on `current->sched_rt_mutex` is skipped entirely, together with the store that clears it, and the flag remains set.
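>
> The root cause is C's && short-circuit: the store that clears the flag
> is a side effect *inside* the asserted expression. A minimal userspace
> sketch reproducing the lost store (standalone variable names are
> hypothetical; relies on GNU C statement expressions):
>
>     #include <stdio.h>
>
>     static int debug_locks;             /* 0: cleared by earlier error */
>     static int sched_rt_mutex = 1;      /* set by a prior pre_schedule */
>
>     #define fetch_and_set(x, v) ({ int _x = (x); (x) = (v); _x; })
>     /* models WARN_ON(debug_locks && !(cond)) */
>     #define lockdep_assert(cond) \
>             do { if (debug_locks && !(cond)) printf("WARN\n"); } while (0)
>
>     int main(void)
>     {
>             /* debug_locks == 0, so !(cond) is never evaluated and the
>              * store to sched_rt_mutex never happens */
>             lockdep_assert(fetch_and_set(sched_rt_mutex, 0));
>             printf("sched_rt_mutex = %d\n", sched_rt_mutex); /* prints 1 */
>             return 0;
>     }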
>
> Later, when the same task re-enters `rt_mutex_slowlock`, it calls `lockdep_reset()` to re-enable `debug_locks`, but the stale `current->sched_rt_mutex` state (left over from the previous lock attempt) causes a false-positive warning in `rt_mutex_pre_schedule()`:
>
> WARNING: CPU: 0 PID: 0 at kernel/sched/core.c:7085 rt_mutex_pre_schedule+0xa8/0x108
>
> Because:
> - `rt_mutex_pre_schedule()` asserts `!current->sched_rt_mutex`
> - But the flag was never cleared, because the post-schedule check was
>   skipped (see the sketch below).
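>
> For context, the asserted 0 -> 1 -> 0 transitions (an abridged sketch
> of the kernel/sched/core.c helpers):
>
>     void rt_mutex_pre_schedule(void)
>     {
>             /* expects 0 -> 1; a stale 1 left behind by a skipped
>              * post-schedule clear fires the warning above */
>             lockdep_assert(!fetch_and_set(current->sched_rt_mutex, 1));
>             sched_submit_work(current);
>     }
>
>     void rt_mutex_post_schedule(void)
>     {
>             sched_update_worker(current);
>             /* expects 1 -> 0; skipped entirely when debug_locks == 0 */
>             lockdep_assert(fetch_and_set(current->sched_rt_mutex, 0));
>     }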
>
> This is not a data race on the flag itself, but a **state inconsistency caused by conditional debugging logic**: the `fetch_and_set` macro is not atomic, but more importantly, the assertion is bypassed when `debug_locks` is off, breaking the expected state transition.

Yeah, I can't really make myself care too much. This means you've
already had errors before -- resulting in debug_locks getting cleared.
Fix those and this problem goes away.

debug_locks is inherently racy; I don't see value in trying to fix all
that.