Message-Id: <20220318161609.1939957-3-longman@redhat.com>
Date:   Fri, 18 Mar 2022 12:16:09 -0400
From:   Waiman Long <longman@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>, Will Deacon <will@...nel.org>,
        Boqun Feng <boqun.feng@...il.com>
Cc:     linux-kernel@...r.kernel.org, Waiman Long <longman@...hat.com>
Subject: [PATCH 2/2] locking/rwsem: Wake readers in a reader-owned rwsem if first waiter is a reader

In an analysis of a recent vmcore, a reader-owned rwsem was found with
385 readers but no writer in the wait queue. That is unusual, but it
may be caused by race conditions that we do not fully understand yet.
In such a case, all the readers in the wait queue should join the other
reader-owners and acquire the read lock.

In rwsem_down_write_slowpath(), an incoming writer will try to wake
up the front readers under such circumstances. That is not the case
for rwsem_down_read_slowpath(), so modify the code to do the same
there. This includes the originally supported case where the wait queue
is empty and the incoming reader is going to wake up itself.

With CONFIG_LOCK_EVENT_COUNTS enabled, the newly added rwsem_rlock_rwake
event counter had 13 hits right after the bootup of a 2-socket system. So
the condition that a reader-owned rwsem has readers at the front of
the wait queue does happen pretty frequently. This patch will help to
speed things up in such cases.

Signed-off-by: Waiman Long <longman@...hat.com>
---
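Note (not part of the commit log): the wake-up decision described above
can be modelled in user space roughly as follows. This is only a minimal
sketch under stated assumptions. The bit layout is a simplified stand-in
modelled on the RWSEM_* masks in kernel/locking/rwsem.c, and the names
should_wake() and enum waiter_type are hypothetical helpers introduced
for illustration, not kernel API.

```c
#include <stdbool.h>

/* Assumed, simplified count layout modelled on kernel/locking/rwsem.c. */
#define WRITER_LOCKED	(1UL << 0)		/* a writer owns the lock */
#define FLAG_WAITERS	(1UL << 1)		/* wait queue is not empty */
#define READER_BIAS	(1UL << 8)		/* one active reader */
#define READER_MASK	(~(READER_BIAS - 1))	/* all reader-count bits */
#define LOCK_MASK	(WRITER_LOCKED | READER_MASK)

/* Hypothetical stand-in for the type of the first queued waiter. */
enum waiter_type { WAITING_FOR_WRITE, WAITING_FOR_READ };

/*
 * Model of the decision made in the read slowpath after the incoming
 * reader has been queued:
 * 1) no active read or write lock: always wake the front of the queue;
 * 2) no writer-owner (possibly reader-owned): wake only if the first
 *    waiter is a reader;
 * otherwise a writer owns the lock and nothing is woken here.
 */
static bool should_wake(unsigned long count, enum waiter_type first)
{
	if (!(count & LOCK_MASK))
		return true;
	if (!(count & WRITER_LOCKED))
		return first == WAITING_FOR_READ;
	return false;
}
```

For example, should_wake(READER_BIAS | FLAG_WAITERS, WAITING_FOR_READ)
returns true in this model, which corresponds to the reader-owned case
with readers at the front of the queue that the new event counter
tracks.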
 kernel/locking/lock_events_list.h |  1 +
 kernel/locking/rwsem.c            | 19 +++++++++++++------
 2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/kernel/locking/lock_events_list.h b/kernel/locking/lock_events_list.h
index 97fb6f3f840a..9bb9f048848b 100644
--- a/kernel/locking/lock_events_list.h
+++ b/kernel/locking/lock_events_list.h
@@ -64,6 +64,7 @@ LOCK_EVENT(rwsem_rlock_steal)	/* # of read locks by lock stealing	*/
 LOCK_EVENT(rwsem_rlock_fast)	/* # of fast read locks acquired	*/
 LOCK_EVENT(rwsem_rlock_fail)	/* # of failed read lock acquisitions	*/
 LOCK_EVENT(rwsem_rlock_handoff)	/* # of read lock handoffs		*/
+LOCK_EVENT(rwsem_rlock_rwake)	/* # of readers wakeup in slow path	*/
 LOCK_EVENT(rwsem_wlock)		/* # of write locks acquired		*/
 LOCK_EVENT(rwsem_wlock_fail)	/* # of failed write lock acquisitions	*/
 LOCK_EVENT(rwsem_wlock_handoff)	/* # of write lock handoffs		*/
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index f71a9693d05a..53f7f0b4724a 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -997,17 +997,24 @@ rwsem_down_read_slowpath(struct rw_semaphore *sem, long count, unsigned int stat
 	count = atomic_long_add_return(adjustment, &sem->count);
 
 	/*
-	 * If there are no active locks, wake the front queued process(es).
-	 *
-	 * If there are no writers and we are first in the queue,
-	 * wake our own waiter to join the existing active readers !
+	 * Do a rwsem_mark_wake() under one of the following conditions:
+	 * 1) there is no active read or write lock.
+	 * 2) there is no writer-owner (can be reader-owned) and the first
+	 *    waiter is a reader.
 	 */
 	if (!(count & RWSEM_LOCK_MASK)) {
 		clear_nonspinnable(sem);
 		wake = true;
+	} else if (!(count & RWSEM_WRITER_MASK)) {
+		wake = rwsem_first_waiter(sem)->type == RWSEM_WAITING_FOR_READ;
+		/*
+		 * Check the number of cases where readers at the front
+		 * of the previously non-empty wait list are to be woken.
+		 */
+		lockevent_cond_inc(rwsem_rlock_rwake,
+				   wake && !(adjustment & RWSEM_FLAG_WAITERS));
 	}
-	if (wake || (!(count & RWSEM_WRITER_MASK) &&
-		    (adjustment & RWSEM_FLAG_WAITERS)))
+	if (wake)
 		rwsem_mark_wake(sem, RWSEM_WAKE_ANY, &wake_q);
 
 	raw_spin_unlock_irq(&sem->wait_lock);
-- 
2.27.0
