Message-ID: <ZOdmPqq6uXMSWOnV@google.com>
Date: Thu, 24 Aug 2023 07:16:30 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Like Xu <like.xu.linux@...il.com>
Cc: kvm@...r.kernel.org, linux-kernel@...r.kernel.org,
Peter Zijlstra <peterz@...radead.org>,
Andy Lutomirski <luto@...nel.org>,
Thomas Gleixner <tglx@...utronix.de>,
Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH 7/7] KVM: VMX: Handle NMI VM-Exits in noinstr region
On Thu, Aug 24, 2023, Like Xu wrote:
> On 13/12/2022 2:09 pm, Sean Christopherson wrote:
> > Move VMX's handling of NMI VM-Exits into vmx_vcpu_enter_exit() so that
> > the NMI is handled prior to leaving the safety of noinstr. Handling the
> > NMI after leaving noinstr exposes the kernel to potential ordering
> > problems as an instrumentation-induced fault, e.g. #DB, #BP, #PF, etc.
> > will unblock NMIs when IRETing back to the faulting instruction.
> (3) In addition, trace_kvm_exit() should ideally appear before the host NMI
> trace logs, which makes it easier to understand.
Ideally, yes, but tracepoints are not remotely noinstr friendly.
> A proposed fix is to delay vmx_do_nmi_irqoff() a little bit, rather than reverting the move:
>
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index e6849f780dba..1f29b7f22da7 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7230,13 +7230,6 @@ static noinstr void vmx_vcpu_enter_exit(struct
> kvm_vcpu *vcpu,
> else
> vmx->exit_reason.full = vmcs_read32(VM_EXIT_REASON);
>
> - if ((u16)vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI &&
> - is_nmi(vmx_get_intr_info(vcpu))) {
> - kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
> - vmx_do_nmi_irqoff();
> - kvm_after_interrupt(vcpu);
> - }
> -
> guest_state_exit_irqoff();
> }
>
> @@ -7389,6 +7382,13 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
>
> trace_kvm_exit(vcpu, KVM_ISA_VMX);
>
> + if ((u16)vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI &&
> + is_nmi(vmx_get_intr_info(vcpu))) {
> + kvm_before_interrupt(vcpu, KVM_HANDLING_NMI);
> + vmx_do_nmi_irqoff();
> + kvm_after_interrupt(vcpu);
> + }
No, the whole point of doing NMI handling in vmx_vcpu_enter_exit() is so that NMIs
are serviced before instrumentation is enabled.
I think the below is sufficient (untested at this point). Not quite minimal, e.g.
I'm pretty sure there's (currently) no need to snapshot IDT_VECTORING_INFO_FIELD
so early, but I can't think of any reason to wait.
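To illustrate the stale-data problem that the patch below addresses, here is a
minimal userspace sketch of a lazily loaded field cache. Everything in it is a
hypothetical stand-in (struct toy_vcpu, toy_get_intr_info(), toy_exit(), the
TOY_* values); none of it is KVM code. It assumes every exit is an "exception or
NMI" exit and that the later exit handler always reads the field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only, not the real KVM definitions. */
#define TOY_REG_INTR_INFO	(1u << 0)	/* models one bit of the lazy-load set */
#define TOY_INFO_NMI		1u		/* toy encoding: "this exit was an NMI" */
#define TOY_INFO_FAULT		2u		/* toy encoding: "this exit was a fault" */

struct toy_vcpu {
	uint32_t regs_avail;	/* bitmask of cached fields that are valid */
	uint32_t intr_info;	/* cached copy of the "hardware" field */
	uint32_t hw_intr_info;	/* what the "hardware" reports for this exit */
};

/* Lazy read: consult the "hardware" only if the cached copy is not valid. */
static uint32_t toy_get_intr_info(struct toy_vcpu *v)
{
	if (!(v->regs_avail & TOY_REG_INTR_INFO)) {
		v->intr_info = v->hw_intr_info;
		v->regs_avail |= TOY_REG_INTR_INFO;
	}
	return v->intr_info;
}

/*
 * Model one exit.  With reset_first == true the cache is invalidated before
 * the early NMI check; with reset_first == false it is invalidated after the
 * check.  Returns true if the early check treated the exit as an NMI.
 */
static bool toy_exit(struct toy_vcpu *v, uint32_t hw_intr_info, bool reset_first)
{
	bool nmi;

	assert(v);
	v->hw_intr_info = hw_intr_info;

	if (reset_first)
		v->regs_avail &= ~TOY_REG_INTR_INFO;

	/* Early check, analogous to the NMI test done right after the exit. */
	nmi = toy_get_intr_info(v) == TOY_INFO_NMI;

	if (!reset_first)
		v->regs_avail &= ~TOY_REG_INTR_INFO;

	/* Later exit handling reads the field again and re-populates the cache. */
	(void)toy_get_intr_info(v);

	return nmi;
}
```

Feeding the sequence fault, NMI, fault through toy_exit() with reset_first set to
false misses the NMI on the second exit and reports a spurious one on the third,
because the early check sees whatever the previous exit's handler cached. With
reset_first set to true, each exit is classified from its own data.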
--
From: Sean Christopherson <seanjc@...gle.com>
Date: Thu, 24 Aug 2023 06:49:36 -0700
Subject: [PATCH] KVM: VMX: Refresh available regs and IDT vectoring info
before NMI handling
Reset the mask of available "registers" and refresh the IDT vectoring
info snapshot in vmx_vcpu_enter_exit(), before KVM potentially handles an
NMI VM-Exit. One of the "registers" that KVM VMX lazily loads is the
vmcs.VM_EXIT_INTR_INFO field, which holds the vector+type on "exception
or NMI" VM-Exits, i.e. is needed to identify NMIs. Clearing the available
registers bitmask after handling NMIs results in KVM querying info from
the last VM-Exit that read vmcs.VM_EXIT_INTR_INFO, and leads to both
missed NMIs and spurious NMIs from the guest's perspective.
Opportunistically grab vmcs.IDT_VECTORING_INFO_FIELD early in the VM-Exit
path too, e.g. to guard against similar consumption of stale data. The
field is read on every "normal" VM-Exit, and there's no point in delaying
the inevitable.
Reported-by: Like Xu <like.xu.linux@...il.com>
Fixes: 11df586d774f ("KVM: VMX: Handle NMI VM-Exits in noinstr region")
Cc: stable@...r.kernel.org
Signed-off-by: Sean Christopherson <seanjc@...gle.com>
---
arch/x86/kvm/vmx/vmx.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index e6849f780dba..d2b78ab7a9f2 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7222,13 +7222,20 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
flags);
vcpu->arch.cr2 = native_read_cr2();
+ vcpu->arch.regs_avail &= ~VMX_REGS_LAZY_LOAD_SET;
+
+ vmx->idt_vectoring_info = 0;
vmx_enable_fb_clear(vmx);
- if (unlikely(vmx->fail))
+ if (unlikely(vmx->fail)) {
vmx->exit_reason.full = 0xdead;
- else
- vmx->exit_reason.full = vmcs_read32(VM_EXIT_REASON);
+ goto out;
+ }
+
+ vmx->exit_reason.full = vmcs_read32(VM_EXIT_REASON);
+ if (likely(!vmx->exit_reason.failed_vmentry))
+ vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
if ((u16)vmx->exit_reason.basic == EXIT_REASON_EXCEPTION_NMI &&
is_nmi(vmx_get_intr_info(vcpu))) {
@@ -7237,6 +7244,7 @@ static noinstr void vmx_vcpu_enter_exit(struct kvm_vcpu *vcpu,
kvm_after_interrupt(vcpu);
}
+out:
guest_state_exit_irqoff();
}
@@ -7358,8 +7366,6 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
loadsegment(es, __USER_DS);
#endif
- vcpu->arch.regs_avail &= ~VMX_REGS_LAZY_LOAD_SET;
-
pt_guest_exit(vmx);
kvm_load_host_xsave_state(vcpu);
@@ -7376,17 +7382,12 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
vmx->nested.nested_run_pending = 0;
}
- vmx->idt_vectoring_info = 0;
-
if (unlikely(vmx->fail))
return EXIT_FASTPATH_NONE;
if (unlikely((u16)vmx->exit_reason.basic == EXIT_REASON_MCE_DURING_VMENTRY))
kvm_machine_check();
- if (likely(!vmx->exit_reason.failed_vmentry))
- vmx->idt_vectoring_info = vmcs_read32(IDT_VECTORING_INFO_FIELD);
-
trace_kvm_exit(vcpu, KVM_ISA_VMX);
if (unlikely(vmx->exit_reason.failed_vmentry))
base-commit: fff2e47e6c3b8050ca26656693caa857e3a8b740
--