Message-Id: <20250827081750.3606616-1-cuiguoqi@kylinos.cn>
Date: Wed, 27 Aug 2025 16:17:50 +0800
From: cuiguoqi <cuiguoqi@...inos.cn>
To: rostedt@...dmis.org
Cc: bigeasy@...utronix.de,
bsegall@...gle.com,
clrkwllms@...nel.org,
cuiguoqi@...inos.cn,
dietmar.eggemann@....com,
guoqi0226@....com,
juri.lelli@...hat.com,
linux-kernel@...r.kernel.org,
linux-rt-devel@...ts.linux.dev,
mgorman@...e.de,
mingo@...hat.com,
peterz@...radead.org,
vincent.guittot@...aro.org,
vschneid@...hat.com
Subject: Re: [PATCH] sched: Fix race in rt_mutex_pre_schedule by removing non-atomic fetch_and_set
The issue arises during EDEADLK testing in `lib/locking-selftest.c` when `is_wait_die=1`.
In this mode, the global `debug_locks` flag is cleared via `__debug_locks_off()` (which calls `xchg(&debug_locks, 0)`) on the blocking path of `rt_mutex_slowlock()`, specifically in `rt_mutex_slowlock_block()`:
    rt_mutex_slowlock()
      rt_mutex_pre_schedule()
      rt_mutex_slowlock_block()
        DEBUG_LOCKS_WARN_ON(ww_ctx->contending_lock)
          __debug_locks_off();    // xchg(&debug_locks, 0)
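As a rough userspace illustration of what the `xchg(&debug_locks, 0)` step does, the sketch below models the flag with C11 atomics. The names `model_debug_locks_off()` and `demo_first_caller_wins()` are hypothetical stand-ins and not kernel functions; the point is only that the exchange returns the previous value and leaves the flag cleared for every later reader.

```c
#include <assert.h>
#include <stdatomic.h>

/* Stand-in for the kernel's global debug_locks flag (assumption: modelled
 * here as a C11 atomic so the exchange has an exact userspace equivalent). */
static atomic_int debug_locks = 1;

/* Hypothetical model of __debug_locks_off(): clear the flag and return
 * the value it held before the exchange. */
static int model_debug_locks_off(void)
{
	return atomic_exchange(&debug_locks, 0);
}

/* Only the first caller observes the flag as set; afterwards every
 * check that is gated on debug_locks sees 0. */
static void demo_first_caller_wins(void)
{
	assert(model_debug_locks_off() == 1);
	assert(model_debug_locks_off() == 0);
}
```

Once the flag is cleared this way, it stays cleared until something explicitly re-enables it, which is why every later check gated on `debug_locks` is affected.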
However, `rt_mutex_post_schedule()` still performs:
    lockdep_assert(fetch_and_set(current->sched_rt_mutex, 0));
Which expands to:
    do {
        WARN_ON(debug_locks && !( ({ int _x = current->sched_rt_mutex; current->sched_rt_mutex = 0; _x; }) ));
    } while (0)
The generated assembly shows that the entire assertion is conditional on `debug_locks`:
    adrp  x0, debug_locks
    ldr   w0, [x0]
    cbz   w0, .label_skip_warn    // Skip WARN if debug_locks == 0
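This means: if `debug_locks` was cleared earlier, the `&&` short-circuits and the `fetch_and_set()` on `current->sched_rt_mutex` is never evaluated. Its side effect (resetting the flag to 0) is skipped along with the check, so the flag remains set.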
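The short-circuit behaviour can be reproduced outside the kernel. The following is a minimal userspace sketch, not kernel code: `debug_locks` and `sched_rt_mutex` are plain globals standing in for the real flag and the per-task field, and `fetch_and_set_flag()`, `model_post_schedule()` and `flag_after_post_schedule()` are hypothetical helpers that only mirror the shape of the expanded macro.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins for the kernel's global flag and per-task field. */
static int debug_locks = 1;
static int sched_rt_mutex;

/* Read the old value, then store the new one (deliberately not atomic). */
static int fetch_and_set_flag(int *flag, int val)
{
	int old = *flag;
	*flag = val;
	return old;
}

/*
 * Same shape as WARN_ON(debug_locks && !fetch_and_set(...)):
 * the right-hand side of && runs only when debug_locks is non-zero.
 * Returns true if the model would have warned.
 */
static bool model_post_schedule(void)
{
	return debug_locks && !fetch_and_set_flag(&sched_rt_mutex, 0);
}

/* Run the post-schedule step under the given debug_locks value and
 * report what is left in sched_rt_mutex afterwards. */
static int flag_after_post_schedule(int debug_locks_value)
{
	debug_locks = debug_locks_value;
	sched_rt_mutex = 1;               /* as left by the pre-schedule step */
	bool warned = model_post_schedule();
	assert(!warned);                  /* flag was set, so no warning either way */
	return sched_rt_mutex;
}
```

In this model, `flag_after_post_schedule(1)` leaves the flag at 0, while `flag_after_post_schedule(0)` leaves it at 1: no warning fires in either case, but only the first case performs the reset.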
Later, when the same task re-enters `rt_mutex_slowlock`, it calls `lockdep_reset()` to re-enable `debug_locks`. The stale `current->sched_rt_mutex` state (left over from the previous lock attempt) then triggers a false-positive warning in `rt_mutex_pre_schedule()`:
    WARNING: CPU: 0 PID: 0 at kernel/sched/core.c:7085 rt_mutex_pre_schedule+0xa8/0x108
Because:
- `rt_mutex_pre_schedule()` asserts `!current->sched_rt_mutex`
- But the flag was never properly cleared due to the skipped post-schedule check.
This is not a data race on the flag itself, but a **state inconsistency caused by conditional debugging logic**. The `fetch_and_set` macro is not atomic, but more importantly, its side effect is bypassed together with the assertion when `debug_locks` is off, which breaks the expected state transition.
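To make the sequence concrete, here is a userspace sketch that replays the two lock attempts and counts simulated warnings. It is not the patch itself: the `*_unconditional()` variants show one possible way to decouple the state update from the assertion and are an assumption made purely for illustration, as are all helper names. `debug_locks`, `sched_rt_mutex` and `warnings` are plain globals standing in for kernel state.

```c
#include <assert.h>
#include <stdbool.h>

static int debug_locks = 1;  /* stand-in for the kernel's global flag */
static int sched_rt_mutex;   /* stand-in for current->sched_rt_mutex */
static int warnings;         /* counts simulated WARN_ON() hits */

/* Non-atomic read-then-write, like the fetch_and_set() expansion. */
static int fetch_and_set_flag(int *flag, int val)
{
	int old = *flag;
	*flag = val;
	return old;
}

/* Current shape: the state update lives inside the conditional assertion. */
static void pre_schedule_conditional(void)
{
	if (debug_locks && fetch_and_set_flag(&sched_rt_mutex, 1))
		warnings++;
}

static void post_schedule_conditional(void)
{
	if (debug_locks && !fetch_and_set_flag(&sched_rt_mutex, 0))
		warnings++;
}

/* Hypothetical shape: update the state first, then assert on the old value. */
static void pre_schedule_unconditional(void)
{
	int old = fetch_and_set_flag(&sched_rt_mutex, 1);
	if (debug_locks && old)
		warnings++;
}

static void post_schedule_unconditional(void)
{
	int old = fetch_and_set_flag(&sched_rt_mutex, 0);
	if (debug_locks && !old)
		warnings++;
}

/* Replay: lock attempt, debug_locks cleared while blocking, debug_locks
 * re-enabled, second lock attempt. Returns the number of warnings. */
static int count_false_positives(bool unconditional)
{
	debug_locks = 1;
	sched_rt_mutex = 0;
	warnings = 0;

	if (unconditional) pre_schedule_unconditional();
	else pre_schedule_conditional();
	assert(warnings == 0);   /* the first attempt starts from clean state */

	debug_locks = 0;         /* models __debug_locks_off() on the blocking path */
	if (unconditional) post_schedule_unconditional();
	else post_schedule_conditional();

	debug_locks = 1;         /* models lockdep_reset() re-enabling the flag */
	if (unconditional) pre_schedule_unconditional();
	else pre_schedule_conditional();

	return warnings;
}
```

With the conditional shape the second pre-schedule step sees the stale flag and the model reports one warning; with the unconditional shape the flag is reset regardless of `debug_locks` and the model reports none.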