linux-kernel - [PATCH v2] locking/rwsem: Take read lock immediate if queue empty with no writer

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [thread-next>] [day] [month] [year] [list]

Date:   Fri, 13 Jul 2018 14:30:53 -0400
From:   Waiman Long <longman@...hat.com>
To:     Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...hat.com>,
        Will Deacon <will.deacon@....com>
Cc:     linux-kernel@...r.kernel.org, Mark Ray <mark.ray@....com>,
        Joe Mario <jmario@...hat.com>,
        Scott Norton <scott.norton@....com>,
        Waiman Long <longman@...hat.com>
Subject: [PATCH v2] locking/rwsem: Take read lock immediate if queue empty with no writer

It was discovered that a constant stream of readers might cause the
count to go negative most of the time after an initial trigger by a
writer even if no writer was present afterward. As a result, most of the
readers would have to go through the slowpath reducing their performance.

To avoid that from happening, an additional check is added to detect
the special case that the reader in the critical section is the only
one in the wait queue and no writer is present. When that happens, it
can just have the lock and return immediately without further action.
Other incoming readers won't see a waiter is present and be forced
into the slowpath.

The additional code is in the slowpath and so should not have an impact
on rwsem performance. However, in the special case listed above, it may
greatly improve performance.

The issue was found in a customer site where they had an application
that pounded on the pread64 syscalls heavily on an XFS filesystem. The
application was run in a recent 4-socket boxes with a lot of CPUs. They
saw significant spinlock contention in the rwsem_down_read_failed() call.
With this patch applied, the system CPU usage went from 85% to 57%,
and the spinlock contention in the pread64 syscalls was gone.

v2: Add customer testing results and remove wording that may cause
    confusion.

Signed-off-by: Waiman Long <longman@...hat.com>
---
 kernel/locking/rwsem-xadd.c | 13 ++++++++++++-
 1 file changed, 12 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/rwsem-xadd.c b/kernel/locking/rwsem-xadd.c
index 3064c50..bf0570e 100644
--- a/kernel/locking/rwsem-xadd.c
+++ b/kernel/locking/rwsem-xadd.c
@@ -233,8 +233,19 @@ static void __rwsem_mark_wake(struct rw_semaphore *sem,
 	waiter.type = RWSEM_WAITING_FOR_READ;

 	raw_spin_lock_irq(&sem->wait_lock);
-	if (list_empty(&sem->wait_list))
+	if (list_empty(&sem->wait_list)) {
+		/*
+		 * In the unlikely event that the task is the only one in
+		 * the wait queue and a writer isn't present, it can have
+		 * the lock and return immediately without going through
+		 * the remaining slowpath code.
+		 */
+		if (unlikely(atomic_long_read(&sem->count) >= 0)) {
+			raw_spin_unlock_irq(&sem->wait_lock);
+			return sem;
+		}
 		adjustment += RWSEM_WAITING_BIAS;
+	}
 	list_add_tail(&waiter.list, &sem->wait_list);

 	/* we're now waiting on the lock, but no longer actively locking */
-- 
1.8.3.1