[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <6e82f186-e059-4535-a0f5-848a055058f3@amd.com>
Date: Mon, 4 Aug 2025 10:12:50 +0530
From: K Prateek Nayak <kprateek.nayak@....com>
To: John Stultz <jstultz@...gle.com>, LKML <linux-kernel@...r.kernel.org>
CC: <syzbot+602c4720aed62576cd79@...kaller.appspotmail.com>, Maarten Lankhorst
<maarten.lankhorst@...ux.intel.com>, Ingo Molnar <mingo@...hat.com>, "Peter
Zijlstra" <peterz@...radead.org>, Juri Lelli <juri.lelli@...hat.com>,
"Vincent Guittot" <vincent.guittot@...aro.org>, Dietmar Eggemann
<dietmar.eggemann@....com>, Valentin Schneider <valentin.schneider@....com>,
Suleiman Souhlal <suleiman@...gle.com>, <airlied@...il.com>,
<mripard@...nel.org>, <simona@...ll.ch>, <tzimmermann@...e.de>,
<dri-devel@...ts.freedesktop.org>, <kernel-team@...roid.com>
Subject: Re: [PATCH v2] locking: Fix __clear_task_blocked_on() warning from
__ww_mutex_wound() path
Hello John,
On 8/2/2025 12:51 AM, John Stultz wrote:
> The __clear_task_blocked_on() helper added a number of sanity
> checks ensuring we hold the mutex wait lock and that the task
> we are clearing blocked_on pointer (if set) matches the mutex.
>
> However, there is an edge case in the _ww_mutex_wound() logic
> where we need to clear the blocked_on pointer for the task that
> owns the mutex, not the task that is waiting on the mutex.
>
> For this case the sanity checks aren't valid, so handle this
> by allowing a NULL lock to skip the additional checks.
>
> This was easier to miss, I realized, as the test-ww_mutex
> driver only exercises the wait-die class of ww_mutexes.
>
> I've got a follow up patch to extend the test so that it
> will exercise both.
>
> Fixes: a4f0b6fef4b0 ("locking/mutex: Add p->blocked_on wrappers for correctness checks")
> Reported-by: syzbot+602c4720aed62576cd79@...kaller.appspotmail.com
> Reported-by: Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>
> Closes: https://lore.kernel.org/lkml/68894443.a00a0220.26d0e1.0015.GAE@google.com/
> Signed-off-by: John Stultz <jstultz@...gle.com>
I've been running this for a while and haven't seen any splats with
syzbot's C reproducer.
> ---
> v2:
> * Rewording of "lock" to "mutex" in commit and comment for
> clarity
> * Rework __clear_task_blocked_on() to use READ_ONCE and WRITE_ONCE
> so we don't trip over the WARNING if two instances race, as suggested
> by K Prateek Nayak and Maarten Lankhorst
>
> Cc: Ingo Molnar <mingo@...hat.com>
> Cc: Peter Zijlstra <peterz@...radead.org>
> Cc: Juri Lelli <juri.lelli@...hat.com>
> Cc: Vincent Guittot <vincent.guittot@...aro.org>
> Cc: Dietmar Eggemann <dietmar.eggemann@....com>
> Cc: Valentin Schneider <valentin.schneider@....com>
> Cc: K Prateek Nayak <kprateek.nayak@....com>
> Cc: Suleiman Souhlal <suleiman@...gle.com>
> Cc: Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>
> Cc: airlied@...il.com
> Cc: mripard@...nel.org
> Cc: simona@...ll.ch
> Cc: tzimmermann@...e.de
> Cc: dri-devel@...ts.freedesktop.org
> Cc: kernel-team@...roid.com
> ---
> include/linux/sched.h | 23 +++++++++++++----------
> kernel/locking/ww_mutex.h | 6 +++++-
> 2 files changed, 18 insertions(+), 11 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 40d2fa90df425..700b50d29f7fe 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2166,16 +2166,19 @@ static inline void set_task_blocked_on(struct task_struct *p, struct mutex *m)
Should we consider using WRITE_ONCE() in __set_task_blocked_on() and
use a local copy of "blocked_on" there too?
I think a set_task_blocked_on() on a separate ww_mutex can still race
with a wound on the ww_ctx which indiscriminately writes NULL to
"owner->blocked_on" and can possibly lead to a splat for:
WARN_ON_ONCE(p->blocked_on && p->blocked_on != m);
^^^^^^^^^^^^^ ^^^^^^^^^^^^^
Sees p is blocked on "m" Turns NULL as a result
already. of a concurrent wound.
A READ_ONCE() and WRITE_ONCE() in __set_task_blocked_on() should help
solve the splat in this very unlikely case too unless I'm mistaken.
Apart from that, this fix looks good. Feel free to include:
Reviewed-and-tested-by: K Prateek Nayak <kprateek.nayak@....com>
--
Thanks and Regards,
Prateek
>
> static inline void __clear_task_blocked_on(struct task_struct *p, struct mutex *m)
> {
> - WARN_ON_ONCE(!m);
> - /* Currently we serialize blocked_on under the mutex::wait_lock */
> - lockdep_assert_held_once(&m->wait_lock);
> - /*
> - * There may be cases where we re-clear already cleared
> - * blocked_on relationships, but make sure we are not
> - * clearing the relationship with a different lock.
> - */
> - WARN_ON_ONCE(m && p->blocked_on && p->blocked_on != m);
> - p->blocked_on = NULL;
> + if (m) {
> + struct mutex *blocked_on = READ_ONCE(p->blocked_on);
> +
> + /* Currently we serialize blocked_on under the mutex::wait_lock */
> + lockdep_assert_held_once(&m->wait_lock);
> + /*
> + * There may be cases where we re-clear already cleared
> + * blocked_on relationships, but make sure we are not
> + * clearing the relationship with a different lock.
> + */
> + WARN_ON_ONCE(blocked_on && blocked_on != m);
> + }
> + WRITE_ONCE(p->blocked_on, NULL);
> }
Powered by blists - more mailing lists