Message-ID: <CANDhNCo0W6cYhVQm7TQso=E9evhYy2oxSLnVz-KxbOdfomZFgQ@mail.gmail.com>
Date: Fri, 13 Dec 2024 18:39:45 -0800
From: John Stultz <jstultz@...gle.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: LKML <linux-kernel@...r.kernel.org>, Ingo Molnar <mingo@...hat.com>,
Will Deacon <will@...nel.org>, Waiman Long <longman@...hat.com>, Boqun Feng <boqun.feng@...il.com>,
Thomas Gleixner <tglx@...utronix.de>, Bert Karwatzki <spasswolf@....de>, kernel-team@...roid.com
Subject: Re: [RFC][PATCH] locking/rtmutex: Make sure we wake anything on the
wake_q when we release the lock->wait_lock
On Fri, Dec 13, 2024 at 4:46 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Thu, Dec 12, 2024 at 02:21:33PM -0800, John Stultz wrote:
> > Bert reported seeing occasional boot hangs when running with
> > PREEMPT_RT and bisected it down to commit 894d1b3db41c
> > ("locking/mutex: Remove wakeups from under mutex::wait_lock").
> >
> > It looks like I missed a few spots where we drop the wait_lock and
> > potentially call into schedule without waking up the tasks on the
> > wake_q structure. Since the tasks being woken are ww_mutex tasks
> > they need to be able to run to release the mutex and unblock the
> > task that currently is planning to wake them. Thus we can deadlock.
> >
> > So make sure we wake the wake_q tasks when we unlock the wait_lock.
> >
> > Cc: Peter Zijlstra <peterz@...radead.org>
> > Cc: Ingo Molnar <mingo@...hat.com>
> > Cc: Will Deacon <will@...nel.org>
> > Cc: Waiman Long <longman@...hat.com>
> > Cc: Boqun Feng <boqun.feng@...il.com>
> > Cc: Thomas Gleixner <tglx@...utronix.de>
> > Cc: Bert Karwatzki <spasswolf@....de>
> > Cc: kernel-team@...roid.com
> > Reported-by: Bert Karwatzki <spasswolf@....de>
> > Closes: https://lore.kernel.org/lkml/20241211182502.2915-1-spasswolf@web.de
> > Fixes: 894d1b3db41c ("locking/mutex: Remove wakeups from under mutex::wait_lock")
> > Signed-off-by: John Stultz <jstultz@...gle.com>
> > ---
>
> I don't suppose this actually makes things much better -- but I had to
> try.
>
>
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -1192,6 +1192,17 @@ try_to_take_rt_mutex(struct rt_mutex_bas
> return 1;
> }
>
> +#define WRAP_WAKE(_stmt, _q) \
> +do { \
> + struct wake_q_head *_Q = (_q); \
> + guard(preempt)(); \
> + _stmt; \
> + if (_Q && !wake_q_empty(_Q)) { \
> + wake_up_q(_Q); \
> + wake_q_init(_Q); \
> + } \
> +} while (0)
> +
> /*
> * Task blocks on lock.
> *
> @@ -1295,13 +1303,7 @@ static int __sched task_blocks_on_rt_mut
> */
> get_task_struct(owner);
>
> - preempt_disable();
> - raw_spin_unlock_irq(&lock->wait_lock);
> - /* wake up any tasks on the wake_q before calling rt_mutex_adjust_prio_chain */
> - wake_up_q(wake_q);
> - wake_q_init(wake_q);
> - preempt_enable();
> -
> + WRAP_WAKE(raw_spin_unlock_irq(&lock->wait_lock), wake_q);
I worry this obscures the _stmt action and makes it a bit harder to read.
I think all of these call sites are tied to the unlock (the one you
quoted earlier was removed in 82f9cc094975240), so would a dedicated
unlock helper be reasonable, something like:
raw_spin_unlock_irq_and_wake(&lock->wait_lock, wake_q)
?
thanks
-john