Message-ID: <87k00cr7ix.ffs@tglx>
Date: Mon, 20 Feb 2023 10:49:26 +0100
From: Thomas Gleixner <tglx@...utronix.de>
To: Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Crystal Wood <swood@...hat.com>
Cc: John Keeping <john@...anate.com>, linux-rt-users@...r.kernel.org,
linux-kernel@...r.kernel.org
Subject: Re: rtmutex, pi_blocked_on, and blk_flush_plug()
On Thu, Feb 16 2023 at 15:17, Sebastian Andrzej Siewior wrote:
> On 2023-02-09 22:31:57 [-0600], Crystal Wood wrote:
>> It is possible for blk_flush_plug() to be called while
>> current->pi_blocked_on is set, in the process of trying to acquire an rwsem.
>> If the block flush blocks trying to acquire some lock, then it appears that
>> current->pi_blocked_on will be overwritten, and then set to NULL once that
>> lock is acquired, even though the task is still blocked on the original
>> rwsem. Am I missing something that deals with this situation? It seems
>> like the lock types that are supposed to call blk_flush_plug() should do so
>> before calling task_blocks_on_rt_mutex().
>
> Do you experience a problem in v6.1-RT?
>
>> I originally noticed this while investigating a related issue on an older
>> RHEL kernel where task_blocked_on_mutex() has a BUG_ON if entered with
>> current->pi_blocked_on non-NULL. Current kernels lack this check.
>
> The logic is different but the deadlock should be avoided:
> - mutex_t and rw_semaphore invoke schedule() while blocking on a lock.
> As part of schedule() sched_submit_work() is invoked.
> This is the same in RT and !RT so I don't expect any deadlock since
> the involved locks are the same.

Huch?

xlog_cil_commit()
  down_read(&cil->xc_ctx_lock)
    __rwbase_read_lock()
      __rt_mutex_slowlock()
        current->pi_blocked_on = ...
        schedule()
          __blk_flush_plug()
            dd_insert_requests()
              rt_spin_lock()
                WARN_ON(current->pi_blocked_on);

So something like the below is required. But that might not cut it
completely. wq_worker_sleeping() is fine, but I'm not convinced that
io_wq_worker_sleeping() is safe. That needs some investigation.
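
To make the failure mode concrete, here is a minimal user-space sketch in C. It is not kernel code. struct task, struct waiter, block_on(), lock_acquired(), flush_plug(), submit_work() and outer_wait_survives() are all hypothetical stand-ins that only model the bookkeeping of current->pi_blocked_on along the call chain in the trace, with and without an early return in the submit path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a lock waiter; not the kernel's rt_mutex_waiter. */
struct waiter {
	const char *lock_name;
};

/* Hypothetical stand-in for the few task fields this model needs. */
struct task {
	struct waiter *pi_blocked_on;	/* lock the task is recorded as blocked on */
	bool plug_pending;		/* plugged I/O not yet submitted */
	int clobbered;			/* times a nested block overwrote the field */
};

/* Model of starting to block on a sleeping lock: record the waiter. */
static void block_on(struct task *t, struct waiter *w)
{
	assert(t && w);
	if (t->pi_blocked_on)
		t->clobbered++;	/* the situation the WARN_ON in the trace reports */
	t->pi_blocked_on = w;
}

/* Model of finally getting the lock: the field is cleared. */
static void lock_acquired(struct task *t)
{
	t->pi_blocked_on = NULL;
}

/* Model of the plug flush: it takes and releases a nested sleeping lock. */
static void flush_plug(struct task *t)
{
	static struct waiter nested = { "nested lock taken by the flush" };

	block_on(t, &nested);
	lock_acquired(t);
	t->plug_pending = false;
}

/* Model of the submit path; 'guard' enables the early return. */
static void submit_work(struct task *t, bool guard)
{
	if (guard && t->pi_blocked_on)
		return;		/* plugged I/O stays pending in this model */
	if (t->plug_pending)
		flush_plug(t);
}

/*
 * Block on an outer lock, run the submit path, and report whether the
 * task is still correctly recorded as blocked on the outer lock.
 */
static bool outer_wait_survives(bool guard)
{
	struct waiter outer = { "outer rwsem" };
	struct task t = { NULL, true, 0 };

	block_on(&t, &outer);
	submit_work(&t, guard);
	return t.pi_blocked_on == &outer && t.clobbered == 0;
}
```

In this model outer_wait_survives(false) reports that the outer wait was lost, because the nested lock overwrote and then cleared the field, while outer_wait_survives(true) keeps it intact. With the guard the plugged I/O is left pending, which the model makes no attempt to resolve.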
Thanks,
tglx
---
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6666,6 +6666,9 @@ static inline void sched_submit_work(str
 	 */
 	SCHED_WARN_ON(current->__state & TASK_RTLOCK_WAIT);
 
+	if (current->pi_blocked_on)
+		return;
+
 	/*
 	 * If we are going to sleep and we have plugged IO queued,
 	 * make sure to submit it to avoid deadlocks.