lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Date:   Wed, 20 Oct 2021 07:06:38 -0400
From:   Paolo Bonzini <pbonzini@...hat.com>
To:     linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Cc:     Davidlohr Bueso <dave@...olabs.net>,
        Oleg Nesterov <oleg@...hat.com>,
        Ingo Molnar <mingo@...nel.org>,
        "Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Wanpeng Li <wanpengli@...cent.com>
Subject: [PATCH] rcuwait: do not enter RCU protection unless a wakeup is needed

In some cases, rcuwait_wake_up can be called even if the actual likelihood
of a wakeup is very low.  If CONFIG_PREEMPT_RCU is active, the resulting
rcu_read_lock/rcu_read_unlock pair can be relatively expensive, and in
fact it is unnecessary when there is no w->task to keep alive: the
memory barrier before the read is what matters in order to avoid missed
wakeups.

Therefore, do an early check of w->task right after the barrier, and skip
rcu_read_lock/rcu_read_unlock unless there is someone waiting for a wakeup.

Running kvm-unit-test/vmexit.flat with APICv disabled, most interrupt
injection tests (tscdeadline*, self_ipi*, x2apic_self_ipi*) improve
by around 600 cpu cycles.

Cc: Davidlohr Bueso <dave@...olabs.net>
Cc: Oleg Nesterov <oleg@...hat.com>
Cc: Ingo Molnar <mingo@...nel.org>
Cc: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Reported-by: Wanpeng Li <wanpengli@...cent.com>
Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
---
 kernel/exit.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 91a43e57a32e..a38a08dbf85e 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -234,8 +234,6 @@ int rcuwait_wake_up(struct rcuwait *w)
 	int ret = 0;
 	struct task_struct *task;
 
-	rcu_read_lock();
-
 	/*
 	 * Order condition vs @task, such that everything prior to the load
 	 * of @task is visible. This is the condition as to why the user called
@@ -245,10 +243,22 @@ int rcuwait_wake_up(struct rcuwait *w)
 	 *    WAIT                WAKE
 	 *    [S] tsk = current	  [S] cond = true
 	 *        MB (A)	      MB (B)
-	 *    [L] cond		  [L] tsk
+	 *    [L] cond		  [L] rcuwait_active(w)
+	 *                            task = rcu_dereference(w->task)
 	 */
 	smp_mb(); /* (B) */
 
+#ifdef CONFIG_PREEMPT_RCU
+	/*
+	 * The cost of rcu_read_lock() dominates for preemptible RCU,
+	 * avoid it if possible.
+	 */
+	if (!rcuwait_active(w))
+		return ret;
+#endif
+
+	rcu_read_lock();
+
 	task = rcu_dereference(w->task);
 	if (task)
 		ret = wake_up_process(task);
-- 
2.27.0

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ