Message-Id: <20220111153539.2532246-4-mark.rutland@arm.com>
Date: Tue, 11 Jan 2022 15:35:37 +0000
From: Mark Rutland <mark.rutland@....com>
To: linux-kernel@...r.kernel.org
Cc: aleksandar.qemu.devel@...il.com, alexandru.elisei@....com,
anup.patel@....com, aou@...s.berkeley.edu, atish.patra@....com,
benh@...nel.crashing.org, borntraeger@...ux.ibm.com, bp@...en8.de,
catalin.marinas@....com, chenhuacai@...nel.org,
dave.hansen@...ux.intel.com, david@...hat.com,
frankja@...ux.ibm.com, frederic@...nel.org, gor@...ux.ibm.com,
hca@...ux.ibm.com, imbrenda@...ux.ibm.com, james.morse@....com,
jmattson@...gle.com, joro@...tes.org, kvm@...r.kernel.org,
mark.rutland@....com, maz@...nel.org, mingo@...hat.com,
mpe@...erman.id.au, nsaenzju@...hat.com, palmer@...belt.com,
paulmck@...nel.org, paulus@...ba.org, paul.walmsley@...ive.com,
pbonzini@...hat.com, seanjc@...gle.com, suzuki.poulose@....com,
tglx@...utronix.de, tsbogend@...ha.franken.de, vkuznets@...hat.com,
wanpengli@...cent.com, will@...nel.org
Subject: [PATCH 3/5] kvm/mips: rework guest entry logic

In kvm_arch_vcpu_ioctl_run() we use guest_enter_irqoff() and
guest_exit_irqoff() directly, with interrupts masked between these. As
we don't handle any timer ticks during this window, we will not account
time spent within the guest as guest time, which is unfortunate.

Additionally, we do not inform lockdep or tracing that interrupts will
be enabled during guest execution, which can lead to misleading traces
and warnings that interrupts have been enabled for overly-long periods.

This patch fixes these issues by using the new timing and context
entry/exit helpers to ensure that interrupts are handled during guest
vtime but with RCU watching, with a sequence:

	guest_timing_enter_irqoff();

	exit_to_guest_mode();
	< run the vcpu >
	enter_from_guest_mode();

	< take any pending IRQs >

	guest_timing_exit_irqoff();
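
To illustrate why the ordering of that sequence matters for tick accounting, here is a minimal userspace sketch. It is not kernel code: struct model_cpu and every model_* function are hypothetical names invented for this illustration, and the model assumes a single timer tick becomes pending while the vCPU runs with interrupts masked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these are real kernel helpers. */
struct model_cpu {
	bool irqs_enabled;
	bool guest_timing;	/* true between timing enter and timing exit */
	bool tick_pending;	/* a timer tick fired while IRQs were masked */
	unsigned int guest_ticks;
	unsigned int host_ticks;
};

/* Unmasking IRQs delivers a pending tick to whichever context is current. */
static void model_irq_enable(struct model_cpu *c)
{
	c->irqs_enabled = true;
	if (c->tick_pending) {
		c->tick_pending = false;
		if (c->guest_timing)
			c->guest_ticks++;
		else
			c->host_ticks++;
	}
}

static void model_irq_disable(struct model_cpu *c)
{
	c->irqs_enabled = false;
}

/* Assumption: one tick fires while the guest runs with IRQs masked. */
static void model_run_vcpu(struct model_cpu *c)
{
	assert(!c->irqs_enabled);
	c->tick_pending = true;
}

/* Old ordering: guest timing ends before IRQs are unmasked. */
static void model_run_old(struct model_cpu *c)
{
	model_irq_disable(c);
	c->guest_timing = true;
	model_run_vcpu(c);
	c->guest_timing = false;
	model_irq_enable(c);	/* tick is charged to the host */
}

/* New ordering: transiently unmask IRQs before guest timing ends. */
static void model_run_new(struct model_cpu *c)
{
	model_irq_disable(c);
	c->guest_timing = true;
	model_run_vcpu(c);
	model_irq_enable(c);	/* tick is charged to the guest */
	model_irq_disable(c);
	c->guest_timing = false;
	model_irq_enable(c);
}
```

Starting from a zero-initialised struct model_cpu, model_run_old() leaves the tick in host_ticks while model_run_new() leaves it in guest_ticks. That is the effect the transient unmask is intended to have.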

Since instrumentation may make use of RCU, we must also ensure that no
instrumented code is run during the EQS. I've split out the critical
section into a new kvm_mips_vcpu_enter_exit() helper which is marked
noinstr.
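
The noinstr requirement can be pictured with a small userspace sketch. This is only a model under stated assumptions: model_rcu_watching, model_trace_hook() and the two model_enter_exit_*() variants are hypothetical names, and the hook merely stands in for any instrumentation that relies on RCU.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; not the kernel's RCU implementation. */
static bool model_rcu_watching = true;
static unsigned int model_unsafe_trace_hits;

/* Stand-in for instrumentation that uses RCU; counts unsafe invocations. */
static void model_trace_hook(void)
{
	if (!model_rcu_watching)
		model_unsafe_trace_hits++;
}

/* Stand-in for running the vCPU. */
static int model_vcpu_run(void)
{
	return 0;
}

/* Instrumented variant: hooks fire while RCU is not watching. */
static int model_enter_exit_instrumented(void)
{
	int ret;

	model_rcu_watching = false;	/* models exit_to_guest_mode() */
	model_trace_hook();
	ret = model_vcpu_run();
	model_trace_hook();
	model_rcu_watching = true;	/* models enter_from_guest_mode() */

	return ret;
}

/* noinstr-style variant: no hooks run inside the EQS. */
static int model_enter_exit_noinstr(void)
{
	int ret;

	model_rcu_watching = false;
	ret = model_vcpu_run();
	model_rcu_watching = true;

	assert(model_rcu_watching);
	model_trace_hook();		/* safe: RCU is watching again */

	return ret;
}
```

In this model the instrumented variant records two unsafe hook invocations per call and the noinstr-style variant records none.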
Signed-off-by: Mark Rutland <mark.rutland@....com>
Cc: Aleksandar Markovic <aleksandar.qemu.devel@...il.com>
Cc: Frederic Weisbecker <frederic@...nel.org>
Cc: Huacai Chen <chenhuacai@...nel.org>
Cc: Paolo Bonzini <pbonzini@...hat.com>
Cc: Paul E. McKenney <paulmck@...nel.org>
Cc: Thomas Bogendoerfer <tsbogend@...ha.franken.de>
---
arch/mips/kvm/mips.c | 37 ++++++++++++++++++++++++++++++++++---
1 file changed, 34 insertions(+), 3 deletions(-)
diff --git a/arch/mips/kvm/mips.c b/arch/mips/kvm/mips.c
index aa20d074d388..f18a3f39163f 100644
--- a/arch/mips/kvm/mips.c
+++ b/arch/mips/kvm/mips.c
@@ -438,6 +438,24 @@ int kvm_arch_vcpu_ioctl_set_guest_debug(struct kvm_vcpu *vcpu,
 	return -ENOIOCTLCMD;
 }
 
+/*
+ * Actually run the vCPU, entering an RCU extended quiescent state (EQS) while
+ * the vCPU is running.
+ *
+ * This must be noinstr as instrumentation may make use of RCU, and this is not
+ * safe during the EQS.
+ */
+static int noinstr kvm_mips_vcpu_enter_exit(struct kvm_vcpu *vcpu)
+{
+	int ret;
+
+	exit_to_guest_mode();
+	ret = kvm_mips_callbacks->vcpu_run(vcpu);
+	enter_from_guest_mode();
+
+	return ret;
+}
+
 int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 {
 	int r = -EINTR;
@@ -458,7 +476,7 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 	lose_fpu(1);
 
 	local_irq_disable();
-	guest_enter_irqoff();
+	guest_timing_enter_irqoff();
 	trace_kvm_enter(vcpu);
 
 	/*
@@ -469,10 +487,23 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
 	 */
 	smp_store_mb(vcpu->mode, IN_GUEST_MODE);
 
-	r = kvm_mips_callbacks->vcpu_run(vcpu);
+	r = kvm_mips_vcpu_enter_exit(vcpu);
+
+	/*
+	 * We must ensure that any pending interrupts are taken before
+	 * we exit guest timing so that timer ticks are accounted as
+	 * guest time. Transiently unmask interrupts so that any
+	 * pending interrupts are taken.
+	 *
+	 * TODO: is there a barrier which ensures that pending interrupts are
+	 * recognised? Currently this just hopes that the CPU takes any pending
+	 * interrupts between the enable and disable.
+	 */
+	local_irq_enable();
+	local_irq_disable();
 
 	trace_kvm_out(vcpu);
-	guest_exit_irqoff();
+	guest_timing_exit_irqoff();
 	local_irq_enable();
 
 out:
--
2.30.2