Message-ID: <0d477668-e369-24fa-ffd2-1cb560910d2c@redhat.com>
Date: Mon, 19 Sep 2022 15:49:19 -0400
From: Waiman Long <longman@...hat.com>
To: Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...hat.com>,
Will Deacon <will.deacon@....com>,
Boqun Feng <boqun.feng@...il.com>
Cc: linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] locking/semaphore: Use wake_q to wake up processes outside lock critical section
On 9/9/22 15:28, Waiman Long wrote:
> It was found that a circular lock dependency can happen with the
> following locking sequence:
>
> +--> (console_sem).lock --> &p->pi_lock --> &rq->__lock --+
> |                                                         |
> +---------------------------------------------------------+
>
> The &p->pi_lock --> &rq->__lock sequence is very common in all the
> task_rq_lock() calls.
>
> The &rq->__lock --> (console_sem).lock sequence happens when the
> scheduler code calls printk() or, more likely, one of the various WARN*()
> macros while holding the rq lock. The (console_sem).lock is actually
> a raw spinlock guarding the semaphore. In the particular lockdep splat
> that I saw, it was caused by a SCHED_WARN_ON() call in update_rq_clock().
> To work around this locking sequence, we may have to ban all WARN*()
> calls when the rq lock is held, which may be too restrictive, or we
> may have to add a WARN_DEFERRED() call and modify all the call sites
> to use it.
>
> Even then, a deferred printk or WARN function may still call
> console_trylock() which may, in turn, call up_console_sem() leading
> to this locking sequence.
>
> The other ((console_sem).lock --> &p->pi_lock) locking sequence
> was caused by the fact that the semaphore up() function is calling
> wake_up_process() while holding the semaphore raw spinlock. This locking
> sequence can be easily eliminated by moving the wake_up_process()
> call out of the raw spinlock critical section using wake_q which is
> what this patch implements. That is the easiest and the most certain
> way to break this circular locking sequence.
>
> v1: https://lore.kernel.org/lkml/20220118153254.358748-1-longman@redhat.com/
>
> Signed-off-by: Waiman Long <longman@...hat.com>
Ping!
Note that the current printk_deferred() code path may also hit this
problem as an up() call of console_sem may be issued.
Cheers,
Longman
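
[Illustration, not part of the patch] The idea of collecting waiters
under the lock and waking them only after the lock is dropped can be
sketched in plain userspace C. Everything below (toy_sem, toy_wake_q,
toy_up() and friends) is a hypothetical stand-in invented for this
sketch; it only models the ordering and is not the kernel's semaphore
or wake_q code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 8

/* Hypothetical stand-in for a sleeping task. */
struct toy_task {
	bool woken;            /* set once the task has been woken */
	bool woken_under_lock; /* true if the wake-up ran with the lock held */
};

/* Hypothetical stand-in for a wake queue: tasks to wake later. */
struct toy_wake_q {
	struct toy_task *tasks[MAX_WAITERS];
	size_t n;
};

/* Hypothetical stand-in for a semaphore guarded by a raw spinlock. */
struct toy_sem {
	bool lock_held;        /* models the raw spinlock state */
	unsigned int count;
	struct toy_task *waiters[MAX_WAITERS];
	size_t nwaiters;
};

static void toy_lock(struct toy_sem *s)
{
	assert(!s->lock_held);
	s->lock_held = true;
}

static void toy_unlock(struct toy_sem *s)
{
	assert(s->lock_held);
	s->lock_held = false;
}

/* Models a wake-up call; records whether the sem lock was held. */
static void toy_wake(struct toy_sem *s, struct toy_task *t)
{
	t->woken = true;
	t->woken_under_lock = s->lock_held;
}

/* Release the semaphore; defer any wake-up until after unlock. */
static void toy_up(struct toy_sem *s)
{
	struct toy_wake_q q = { .n = 0 };
	size_t i;

	toy_lock(s);
	if (s->nwaiters == 0) {
		s->count++;
	} else {
		/* Dequeue the first waiter and queue it for a later wake-up. */
		struct toy_task *t = s->waiters[0];

		for (i = 1; i < s->nwaiters; i++)
			s->waiters[i - 1] = s->waiters[i];
		s->nwaiters--;
		q.tasks[q.n++] = t;
	}
	toy_unlock(s);

	/* The wake-ups happen here, outside the critical section. */
	for (i = 0; i < q.n; i++)
		toy_wake(s, q.tasks[i]);
}
```

Because toy_wake() never runs with lock_held set, the model never takes
a "sem lock --> wake-up path" ordering, which is the edge of the cycle
that the patch description says it removes.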