Message-ID: <187635d0-7786-5d8f-a41a-45a6abd9d001@redhat.com>
Date: Thu, 5 Nov 2020 17:08:49 +0100
From: Paolo Bonzini <pbonzini@...hat.com>
To: yadong.qi@...el.com, kvm@...r.kernel.org,
linux-kernel@...r.kernel.org, x86@...nel.org
Cc: sean.j.christopherson@...el.com, vkuznets@...hat.com,
wanpengli@...cent.com, jmattson@...gle.com, joro@...tes.org,
tglx@...utronix.de, mingo@...hat.com, bp@...en8.de, hpa@...or.com,
liran.alon@...cle.com, nikita.leshchenko@...cle.com,
chao.gao@...el.com, kevin.tian@...el.com, luhai.chen@...el.com,
bing.zhu@...el.com, kai.z.wang@...el.com
Subject: Re: [PATCH] KVM: x86: emulate wait-for-SIPI and SIPI-VMExit
On 22/09/20 07:23, yadong.qi@...el.com wrote:
> From: Yadong Qi <yadong.qi@...el.com>
>
> Background: We have a lightweight HV; it needs INIT-VMExit and
> SIPI-VMExit to wake up APs for guests, since it does not monitor
> the Local APIC. But currently the virtual wait-for-SIPI (WFS) state
> is not supported in nVMX, so when running on top of KVM, the L1
> HV cannot receive the INIT-VMExit and SIPI-VMExit, and therefore
> the L2 guest cannot wake up its APs.
>
> According to Intel SDM Chapter 25.2 Other Causes of VM Exits,
> SIPIs cause VM exits when a logical processor is in
> wait-for-SIPI state.
>
> In this patch:
> 1. introduce SIPI exit reason,
> 2. introduce wait-for-SIPI state for nVMX,
> 3. advertise wait-for-SIPI support to guest.
>
> When the L1 hypervisor is not monitoring the Local APIC, L0 needs to
> emulate INIT-VMExit and SIPI-VMExit to L1 in order to emulate the
> INIT-SIPI-SIPI sequence for L2. An L2 LAPIC write is trapped by the
> L0 hypervisor (KVM), and L0 should emulate the INIT/SIPI vmexit to
> the L1 hypervisor to set the proper state for L2's vCPU.
There is a problem in this patch, in that this change is incorrect:
>
> @@ -2847,7 +2847,8 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
>  	 */
>  	if (kvm_vcpu_latch_init(vcpu)) {
>  		WARN_ON_ONCE(vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED);
> -		if (test_bit(KVM_APIC_SIPI, &apic->pending_events))
> +		if (test_bit(KVM_APIC_SIPI, &apic->pending_events) &&
> +		    !is_guest_mode(vcpu))
>  			clear_bit(KVM_APIC_SIPI, &apic->pending_events);
>  		return;
>  	}
Here you're not trying to process a latched INIT; you just want to delay
the processing of the SIPI until check_nested_events.
The change does have a correct part in it. In particular,
vmx_apic_init_signal_blocked should have been
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 47b8357b9751..64339121a4f0 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7558,7 +7558,7 @@ static void enable_smi_window(struct kvm_vcpu *vcpu)
 
 static bool vmx_apic_init_signal_blocked(struct kvm_vcpu *vcpu)
 {
-	return to_vmx(vcpu)->nested.vmxon;
+	return to_vmx(vcpu)->nested.vmxon && !is_guest_mode(vcpu);
 }
 
 static void vmx_migrate_timers(struct kvm_vcpu *vcpu)
to only latch INIT signals in root mode. However, SIPI must be cleared
unconditionally on SVM; the "!is_guest_mode" test in that case is incorrect.
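For reference, the SVM side looks roughly like this (paraphrased from
memory of svm.c, so take the details as illustrative): INITs are
latched there even in guest mode, whenever GIF=0 or L1 intercepts INIT:

static bool svm_apic_init_signal_blocked(struct kvm_vcpu *vcpu)
{
	struct vcpu_svm *svm = to_svm(vcpu);

	/* INIT is latched with GIF clear, or when L1 intercepts INIT. */
	return !gif_set(svm) ||
	       vmcb_is_intercept(&svm->vmcb->control, INTERCEPT_INIT);
}

Since the latch path is reachable in guest mode on SVM, the SIPI-eating
code in kvm_apic_accept_events has to stay unconditional.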
The right way to do it is to call check_nested_events from
kvm_apic_accept_events. This will cause an INIT or SIPI vmexit, as
required. There is some extra complication: pending_events has to be
read *before* calling check_nested_events, so that the resulting vmexit
does not steal from the guest any INIT or SIPI that is sent afterwards.
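Untested, but the shape I have in mind is roughly the following sketch
(it relies on the existing is_guest_mode() and nested_ops->check_events
hooks; everything else is illustrative):

void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
{
	struct kvm_lapic *apic = vcpu->arch.apic;
	unsigned long pe;

	if (!lapic_in_kernel(vcpu) || !apic->pending_events)
		return;

	/*
	 * Snapshot pending_events first: an INIT or SIPI that arrives
	 * after this point belongs to the next invocation and must not
	 * be consumed by the vmexit below.
	 */
	pe = smp_load_acquire(&apic->pending_events);

	if (is_guest_mode(vcpu)) {
		/*
		 * This can cause an INIT or SIPI vmexit to L1.  If it
		 * does, INITs are latched afterwards, so the events in
		 * pe are not delivered to L2 by mistake.
		 */
		if (kvm_x86_ops.nested_ops->check_events(vcpu) < 0)
			return;
	}

	/* ... process the events captured in pe, not a fresh read ... */
}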
Thanks to your test case, I will test a patch and send it.
Paolo