Message-Id: <20211020110638.797389-1-pbonzini@redhat.com>
Date: Wed, 20 Oct 2021 07:06:38 -0400
From: Paolo Bonzini <pbonzini@...hat.com>
To: linux-kernel@...r.kernel.org, kvm@...r.kernel.org
Cc: Davidlohr Bueso <dave@...olabs.net>,
Oleg Nesterov <oleg@...hat.com>,
Ingo Molnar <mingo@...nel.org>,
"Paul E . McKenney" <paulmck@...ux.vnet.ibm.com>,
Peter Zijlstra <peterz@...radead.org>,
Wanpeng Li <wanpengli@...cent.com>
Subject: [PATCH] rcuwait: do not enter RCU protection unless a wakeup is needed

In some cases, rcuwait_wake_up can be called even though a wakeup is very
unlikely to be needed. If CONFIG_PREEMPT_RCU is enabled, the resulting
rcu_read_lock/rcu_read_unlock pair can be relatively expensive, and in
fact it is unnecessary when there is no w->task to keep alive: the
memory barrier before the read is what matters in order to avoid missed
wakeups.

Therefore, do an early check of w->task right after the barrier, and skip
rcu_read_lock/rcu_read_unlock unless there is someone waiting for a wakeup.
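
The idea can be illustrated with a rough userspace sketch using C11 atomics.
This is not kernel code: waitslot, slot_active, slot_wake_up and
slow_path_entries are hypothetical stand-ins for struct rcuwait,
rcuwait_active(), rcuwait_wake_up() and the rcu_read_lock/rcu_read_unlock
pair, and a sequentially consistent fence stands in for smp_mb(). Unlike the
patch, the sketch performs the early check unconditionally rather than only
under CONFIG_PREEMPT_RCU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a sleeping task. */
struct waiter {
	int woken;
};

/* Hypothetical stand-in for struct rcuwait: one optional waiter. */
struct waitslot {
	_Atomic(struct waiter *) task;
};

/* Counts how often the simulated read-side critical section is entered. */
static int slow_path_entries;

/* Models rcuwait_active(): a plain check for a registered waiter. */
static int slot_active(struct waitslot *w)
{
	return atomic_load_explicit(&w->task, memory_order_relaxed) != NULL;
}

/* Models rcuwait_wake_up() with the early exit placed after the barrier. */
static int slot_wake_up(struct waitslot *w)
{
	struct waiter *t;
	int ret = 0;

	assert(w != NULL);

	/* Full barrier: the caller's condition store is ordered before the load. */
	atomic_thread_fence(memory_order_seq_cst);

	/* Nobody is waiting, so there is no task to keep alive. */
	if (!slot_active(w))
		return ret;

	slow_path_entries++;	/* where rcu_read_lock() would be taken */
	t = atomic_load_explicit(&w->task, memory_order_acquire);
	if (t) {
		t->woken = 1;	/* stands in for wake_up_process(task) */
		ret = 1;
	}
	/* where rcu_read_unlock() would be called */
	return ret;
}
```

With no waiter registered, slot_wake_up returns 0 without touching
slow_path_entries; once a waiter is stored in the slot, the slow path is
entered and the waiter is marked as woken.
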

Running kvm-unit-test/vmexit.flat with APICv disabled, most interrupt
injection tests (tscdeadline*, self_ipi*, x2apic_self_ipi*) improve
by around 600 CPU cycles.

Cc: Davidlohr Bueso <dave@...olabs.net>
Cc: Oleg Nesterov <oleg@...hat.com>
Cc: Ingo Molnar <mingo@...nel.org>
Cc: Paul E. McKenney <paulmck@...ux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@...radead.org>
Reported-by: Wanpeng Li <wanpengli@...cent.com>
Signed-off-by: Paolo Bonzini <pbonzini@...hat.com>
---
kernel/exit.c | 16 +++++++++++++---
1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/kernel/exit.c b/kernel/exit.c
index 91a43e57a32e..a38a08dbf85e 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -234,8 +234,6 @@ int rcuwait_wake_up(struct rcuwait *w)
int ret = 0;
struct task_struct *task;
- rcu_read_lock();
-
/*
* Order condition vs @task, such that everything prior to the load
* of @task is visible. This is the condition as to why the user called
@@ -245,10 +243,22 @@ int rcuwait_wake_up(struct rcuwait *w)
* WAIT WAKE
* [S] tsk = current [S] cond = true
* MB (A) MB (B)
- * [L] cond [L] tsk
+ * [L] cond [L] rcuwait_active(w)
+ * task = rcu_dereference(w->task)
*/
smp_mb(); /* (B) */
+#ifdef CONFIG_PREEMPT_RCU
+ /*
+ * The cost of rcu_read_lock() dominates for preemptible RCU,
+ * avoid it if possible.
+ */
+ if (!rcuwait_active(w))
+ return ret;
+#endif
+
+ rcu_read_lock();
+
task = rcu_dereference(w->task);
if (task)
ret = wake_up_process(task);
--
2.27.0