Message-ID: <20230322162719.wYG1N0hh@linutronix.de>
Date:   Wed, 22 Mar 2023 17:27:19 +0100
From:   Sebastian Andrzej Siewior <bigeasy@...utronix.de>
To:     Thomas Gleixner <tglx@...utronix.de>, linux-kernel@...r.kernel.org
Cc:     Crystal Wood <swood@...hat.com>, John Keeping <john@...anate.com>,
        Boqun Feng <boqun.feng@...il.com>,
        Ingo Molnar <mingo@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Waiman Long <longman@...hat.com>, Will Deacon <will@...nel.org>
Subject: [PATCH] locking/rtmutex: Flush the plug before entering the slowpath.

blk_flush_plug() is invoked on schedule() to flush out the IO which has been
queued so far so that it becomes globally visible. This is important to
avoid deadlocks because a lock owner can itself be waiting for that IO.
Therefore the plugged IO must be flushed before a thread blocks on a lock.
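
A rough sketch of the deadlock this avoids (the IO wait is simplified and
only shown as comments, M is any sleeping lock shared by both tasks):

	task A:	mutex_lock(&M);
		/* submit a bio and wait for its completion ... */

	task B:	blk_start_plug(&plug);
		/* queue the bio task A waits for, it stays in the plug */
		mutex_lock(&M);		/* blocks, the plug is never flushed */

A waits for IO which is still sitting in B's plug list, B waits for M which
A holds. Flushing B's plug before it blocks on M breaks the cycle.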

The plug flush routine can itself acquire a contended sleeping lock.
Blocking on a lock requires an assignment to task_struct::pi_blocked_on.
If blk_flush_plug() is invoked from schedule() in the slow path, then the
variable is already set for the original lock and gets overwritten by the
lock acquired within blk_flush_plug().
Therefore blk_flush_plug() needs to be invoked (blocking on any locks it
takes in the process) before blocking on the actual lock.
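
The problematic nesting looks roughly like this (function names
abbreviated, the spinlock_t stands for any contended sleeping lock taken
during the flush):

	rt_mutex_slowlock(A)
	  current->pi_blocked_on = &waiter_A;
	  schedule()
	    sched_submit_work()
	      blk_flush_plug(...)
	        spin_lock(B)		/* contended, sleeps on PREEMPT_RT */
	          current->pi_blocked_on = &waiter_B;	/* clobbers waiter_A */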

Invoke blk_flush_plug() before blocking on a sleeping lock. The
PREEMPT_RT-only sleeping locks (spinlock_t and rwlock_t) are excluded
because their slow path does not invoke blk_flush_plug().

Fixes: e17ba59b7e8e1 ("locking/rtmutex: Guard regular sleeping locks specific functions")
Reported-by: Crystal Wood <swood@...hat.com>
Link: https://lore.kernel.org/4b4ab374d3e24e6ea8df5cadc4297619a6d945af.camel@redhat.com
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@...utronix.de>
---
On 2023-02-20 19:21:51 [+0100], Thomas Gleixner wrote:
> 
> This still leaves the problem vs. io_wq_worker_sleeping() and it's
> running() counterpart after schedule().

io_wq_worker_sleeping() has a kfree() so it probably should be moved,
too.
io_wq_worker_running() is just an OR and an INC and is fine.

> Aside of that for CONFIG_DEBUG_RT_MUTEXES=y builds it flushes on every
> lock operation whether the lock is contended or not.

That is true for mutex & ww_mutex operations; rwsem is not affected by
CONFIG_DEBUG_RT_MUTEXES. As for mutex it could be mitigated by invoking
try_to_take_rt_mutex() before blk_flush_plug().
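
If I read it right, the debug build flushes unconditionally because
CONFIG_DEBUG_RT_MUTEXES compiles the cmpxchg fast path out, roughly:

	static __always_inline bool rt_mutex_cmpxchg_acquire(struct rt_mutex_base *lock,
							     struct task_struct *old,
							     struct task_struct *new)
	{
		return false;
	}

so __rt_mutex_lock() and __ww_rt_mutex_lock() always fall through to the
new blk_flush_plug() call, whether the lock is contended or not.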

 kernel/locking/rtmutex.c     | 7 +++++++
 kernel/locking/rwbase_rt.c   | 8 ++++++++
 kernel/locking/ww_rt_mutex.c | 5 +++++
 3 files changed, 20 insertions(+)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 728f434de2bbf..c1bc2cb1522cb 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -23,6 +23,7 @@
 #include <linux/sched/rt.h>
 #include <linux/sched/wake_q.h>
 #include <linux/ww_mutex.h>
+#include <linux/blkdev.h>
 
 #include <trace/events/lock.h>
 
@@ -1700,6 +1701,12 @@ static __always_inline int __rt_mutex_lock(struct rt_mutex_base *lock,
 	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
 		return 0;
 
+	/*
+	 * If we are going to sleep and we have plugged IO queued, make sure to
+	 * submit it to avoid deadlocks.
+	 */
+	blk_flush_plug(current->plug, true);
+
 	return rt_mutex_slowlock(lock, NULL, state);
 }
 #endif /* RT_MUTEX_BUILD_MUTEX */
diff --git a/kernel/locking/rwbase_rt.c b/kernel/locking/rwbase_rt.c
index c201aadb93017..70c08ec4ad8af 100644
--- a/kernel/locking/rwbase_rt.c
+++ b/kernel/locking/rwbase_rt.c
@@ -143,6 +143,14 @@ static __always_inline int rwbase_read_lock(struct rwbase_rt *rwb,
 	if (rwbase_read_trylock(rwb))
 		return 0;
 
+	if (state != TASK_RTLOCK_WAIT) {
+		/*
+		 * If we are going to sleep and we have plugged IO queued,
+		 * make sure to submit it to avoid deadlocks.
+		 */
+		blk_flush_plug(current->plug, true);
+	}
+
 	return __rwbase_read_lock(rwb, state);
 }
 
diff --git a/kernel/locking/ww_rt_mutex.c b/kernel/locking/ww_rt_mutex.c
index d1473c624105c..472e3622abf09 100644
--- a/kernel/locking/ww_rt_mutex.c
+++ b/kernel/locking/ww_rt_mutex.c
@@ -67,6 +67,11 @@ __ww_rt_mutex_lock(struct ww_mutex *lock, struct ww_acquire_ctx *ww_ctx,
 			ww_mutex_set_context_fastpath(lock, ww_ctx);
 		return 0;
 	}
+	/*
+	 * If we are going to sleep and we have plugged IO queued, make sure to
+	 * submit it to avoid deadlocks.
+	 */
+	blk_flush_plug(current->plug, true);
 
 	ret = rt_mutex_slowlock(&rtm->rtmutex, ww_ctx, state);
 
-- 
2.40.0
