Message-ID: <87tuearwc7.wl-maz@kernel.org>
Date: Tue, 11 Jan 2022 17:55:20 +0000
From: Marc Zyngier <maz@...nel.org>
To: Mark Rutland <mark.rutland@....com>
Cc: linux-kernel@...r.kernel.org, aleksandar.qemu.devel@...il.com,
alexandru.elisei@....com, anup.patel@....com,
aou@...s.berkeley.edu, atish.patra@....com,
benh@...nel.crashing.org, borntraeger@...ux.ibm.com, bp@...en8.de,
catalin.marinas@....com, chenhuacai@...nel.org,
dave.hansen@...ux.intel.com, david@...hat.com,
frankja@...ux.ibm.com, frederic@...nel.org, gor@...ux.ibm.com,
hca@...ux.ibm.com, imbrenda@...ux.ibm.com, james.morse@....com,
jmattson@...gle.com, joro@...tes.org, kvm@...r.kernel.org,
mingo@...hat.com, mpe@...erman.id.au, nsaenzju@...hat.com,
palmer@...belt.com, paulmck@...nel.org, paulus@...ba.org,
paul.walmsley@...ive.com, pbonzini@...hat.com, seanjc@...gle.com,
suzuki.poulose@....com, tglx@...utronix.de,
tsbogend@...ha.franken.de, vkuznets@...hat.com,
wanpengli@...cent.com, will@...nel.org
Subject: Re: [PATCH 2/5] kvm/arm64: rework guest entry logic
On Tue, 11 Jan 2022 15:35:36 +0000,
Mark Rutland <mark.rutland@....com> wrote:
>
> In kvm_arch_vcpu_ioctl_run() we enter an RCU extended quiescent state
> (EQS) by calling guest_enter_irqoff(), and unmask IRQs prior to
> exiting the EQS by calling guest_exit(). As the IRQ entry code will not
> wake RCU in this case, we may run the core IRQ code and IRQ handler
> without RCU watching, leading to various potential problems.
>
> Additionally, we do not inform lockdep or tracing that interrupts will
> be enabled during guest execution, which can lead to misleading traces
> and warnings that interrupts have been enabled for overly-long periods.
>
> This patch fixes these issues by using the new timing and context
> entry/exit helpers to ensure that interrupts are handled during guest
> vtime but with RCU watching, with a sequence:
>
> guest_timing_enter_irqoff();
>
> exit_to_guest_mode();
> < run the vcpu >
> enter_from_guest_mode();
>
> < take any pending IRQs >
>
> guest_timing_exit_irqoff();
>
> Since instrumentation may make use of RCU, we must also ensure that no
> instrumented code is run during the EQS. I've split out the critical
> section into a new kvm_arm_vcpu_enter_exit() helper which is marked
> noinstr.
>
> Fixes: 1b3d546daf85ed2b ("arm/arm64: KVM: Properly account for guest CPU time")
> Reported-by: Nicolas Saenz Julienne <nsaenzju@...hat.com>
> Signed-off-by: Mark Rutland <mark.rutland@....com>
> Cc: Alexandru Elisei <alexandru.elisei@....com>
> Cc: Catalin Marinas <catalin.marinas@....com>
> Cc: Frederic Weisbecker <frederic@...nel.org>
> Cc: James Morse <james.morse@....com>
> Cc: Marc Zyngier <maz@...nel.org>
> Cc: Paolo Bonzini <pbonzini@...hat.com>
> Cc: Paul E. McKenney <paulmck@...nel.org>
> Cc: Suzuki K Poulose <suzuki.poulose@....com>
> Cc: Will Deacon <will@...nel.org>
> ---
> arch/arm64/kvm/arm.c | 51 ++++++++++++++++++++++++++++----------------
> 1 file changed, 33 insertions(+), 18 deletions(-)
>
> diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
> index e4727dc771bf..1721df2522c8 100644
> --- a/arch/arm64/kvm/arm.c
> +++ b/arch/arm64/kvm/arm.c
> @@ -764,6 +764,24 @@ static bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu, int *ret)
> xfer_to_guest_mode_work_pending();
> }
>
> +/*
> + * Actually run the vCPU, entering an RCU extended quiescent state (EQS) while
> + * the vCPU is running.
> + *
> + * This must be noinstr as instrumentation may make use of RCU, and this is not
> + * safe during the EQS.
> + */
> +static int noinstr kvm_arm_vcpu_enter_exit(struct kvm_vcpu *vcpu)
> +{
> + int ret;
> +
> + exit_to_guest_mode();
> + ret = kvm_call_hyp_ret(__kvm_vcpu_run, vcpu);
> + enter_from_guest_mode();
> +
> + return ret;
> +}
> +
> /**
> * kvm_arch_vcpu_ioctl_run - the main VCPU run function to execute guest code
> * @vcpu: The VCPU pointer
> @@ -854,9 +872,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> * Enter the guest
> */
> trace_kvm_entry(*vcpu_pc(vcpu));
> - guest_enter_irqoff();
> + guest_timing_enter_irqoff();
>
> - ret = kvm_call_hyp_ret(__kvm_vcpu_run, vcpu);
> + ret = kvm_arm_vcpu_enter_exit(vcpu);
>
> vcpu->mode = OUTSIDE_GUEST_MODE;
> vcpu->stat.exits++;
> @@ -891,26 +909,23 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu)
> kvm_arch_vcpu_ctxsync_fp(vcpu);
>
> /*
> - * We may have taken a host interrupt in HYP mode (ie
> - * while executing the guest). This interrupt is still
> - * pending, as we haven't serviced it yet!
> + * We must ensure that any pending interrupts are taken before
> + * we exit guest timing so that timer ticks are accounted as
> + * guest time. Transiently unmask interrupts so that any
> + * pending interrupts are taken.
> *
> - * We're now back in SVC mode, with interrupts
> - * disabled. Enabling the interrupts now will have
> - * the effect of taking the interrupt again, in SVC
> - * mode this time.
> + * Per ARM DDI 0487G.b section D1.13.4, an ISB (or other
> + * context synchronization event) is necessary to ensure that
> + * pending interrupts are taken.
> */
> local_irq_enable();
> + isb();
> + local_irq_disable();
Small nit: we may be able to elide this enable/isb/disable dance if a
read of ISR_EL1 returns 0.
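Something like this completely untested sketch, which only performs the
dance when ISR_EL1 reports a pending interrupt:

	/* Untested: skip the enable/isb/disable dance if nothing is pending */
	if (read_sysreg(isr_el1)) {
		local_irq_enable();
		isb();
		local_irq_disable();
	}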
> +
> + guest_timing_exit_irqoff();
> +
> + local_irq_enable();
>
> - /*
> - * We do local_irq_enable() before calling guest_exit() so
> - * that if a timer interrupt hits while running the guest we
> - * account that tick as being spent in the guest. We enable
> - * preemption after calling guest_exit() so that if we get
> - * preempted we make sure ticks after that is not counted as
> - * guest time.
> - */
> - guest_exit();
> trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
>
> /* Exit types that need handling before we can be preempted */
Reviewed-by: Marc Zyngier <maz@...nel.org>
M.
--
Without deviation from the norm, progress is not possible.