Message-ID: <YbIjCUAECOyIbsYQ@google.com>
Date:   Thu, 9 Dec 2021 15:38:49 +0000
From:   Sean Christopherson <seanjc@...gle.com>
To:     Maxim Levitsky <mlevitsk@...hat.com>
Cc:     kvm@...r.kernel.org,
        "open list:X86 ARCHITECTURE (32-BIT AND 64-BIT)" 
        <linux-kernel@...r.kernel.org>, Wanpeng Li <wanpengli@...cent.com>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Joerg Roedel <joro@...tes.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Vitaly Kuznetsov <vkuznets@...hat.com>,
        Borislav Petkov <bp@...en8.de>,
        "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)" <x86@...nel.org>,
        Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>,
        Jim Mattson <jmattson@...gle.com>
Subject: Re: [PATCH 4/6] KVM: SVM: fix races in the AVIC incomplete IPI
 delivery to vCPUs

On Thu, Dec 09, 2021, Maxim Levitsky wrote:
> If the target vCPU has AVIC inhibited while the source vCPU doesn't,
> we need to set irr_pending for the target to notice the interrupt.
> Do it unconditionally to be safe, as svm_deliver_avic_intr does.
> 
> Also if the target has AVIC inhibited, the same kind of races
> that happen in svm_deliver_avic_intr can happen here as well,
> so apply the same approach of kicking the target vCPUs.
> 
> Signed-off-by: Maxim Levitsky <mlevitsk@...hat.com>
> ---
>  arch/x86/kvm/svm/avic.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 8c1b934bfa9b..bdfc37caa64a 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -304,8 +304,17 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
>  	kvm_for_each_vcpu(i, vcpu, kvm) {
>  		if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
>  					GET_APIC_DEST_FIELD(icrh),
> -					icrl & APIC_DEST_MASK))
> -			kvm_vcpu_wake_up(vcpu);
> +					icrl & APIC_DEST_MASK)) {

What about leveraging svm_deliver_avic_intr() so that all the logic for handling
this mess is more or less contained in one location?  And if the vCPU has made
its way back to the guest with APICv=1, we'd also avoid an unnecessary kick.

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 26ed5325c593..cf9f5caa6e1b 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -304,8 +304,12 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
        kvm_for_each_vcpu(i, vcpu, kvm) {
                if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
                                        GET_APIC_DEST_FIELD(icrh),
-                                       icrl & APIC_DEST_MASK))
-                       kvm_vcpu_wake_up(vcpu);
+                                       icrl & APIC_DEST_MASK)) {
+                       if (svm_deliver_avic_intr(vcpu, -1)) {
+                               vcpu->arch.apic->irr_pending = true;
+                               kvm_make_request(KVM_REQ_EVENT, vcpu);
+                       }
+               }
        }
 }
 

And change svm_deliver_avic_intr() to ignore a negative vector, probably with a
comment saying that means the vector has already been set in the IRR.

	if (vec > 0) {
		kvm_lapic_set_irr(vec, vcpu->arch.apic);

		/*
		 * Pairs with the smp_mb_*() after setting vcpu->guest_mode in
		 * vcpu_enter_guest() to ensure the write to the vIRR is ordered before
		 * the read of guest_mode, which guarantees that either VMRUN will see
		 * and process the new vIRR entry, or that the below code will signal
		 * the doorbell if the vCPU is already running in the guest.
		 */
		smp_mb__after_atomic();
	}
