[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ab2b29ac-adcf-12ab-7c53-cc427995e94c@amd.com>
Date: Mon, 11 Mar 2019 11:38:05 +0000
From: "Suthikulpanit, Suravee" <Suravee.Suthikulpanit@....com>
To: Oren Twaig <oren@...lemp.com>
CC: "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"x86@...nel.org" <x86@...nel.org>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] svm: Fix AVIC incomplete IPI emulation
Hi Oren,
Sorry for delay response.
On 3/5/19 1:15 AM, Oren Twaig wrote:
> Hello Suravee,
>
> According to AMD's SDM, the target-not-running incomplete
> ipi exit is only received if any of the destination cpus had the
> not-running bit set in the avic backing page.
I believe you are referring to the "isRunning" (IR) bit is in the
AVIC physical APIC ID table entry.
> However, not before the CPU _already_ set the relevant IRR bit
> in all these cpus.
Not sure what you meant here.
> In this change, the patch forces KVM to send another interrupt
> to the vcpu whether SVM already did that or not. Which means
> the vcpu/s, under some conditions, can get an EXTRA interrupt
> it never intended to get >> Example:
> 1. vcpu B: Is in "not-running" state.
> 2. vcpu A: Writes to the ICR to send vector 80 to vcpu B
> 3. vcpu A: SVM updates vcpu B IRR with bit 80
> 4. vcpu A: SVM exits on incomplete IPI target-not-running exit.
> 5. vcpu A: Now stops executing any code @ hypervisor level.
> 6. vcpu B: Due to another interrupt (like lapic timer)
> resumes running the guest. While handling interrupts,
> it also handles interrupt vector 80 (as it's in his IRR)
> 7. vcpu A: resumes executing the below code and sends
> an _additional_interrupt to vcpu B.
>
> Overall, vcpu B got two interrupts. The second is unwanted and
> not documented in the system architecture.
>
> Can you please elaborate more to why the implementation
> below conflict with the specifications (which was the code
> before this commit) ?
This patch was introduced to fix an issue where the SVM driver tries to
handle the step 5 above by scheduling vcpu B into _running_ state to handle
the IPI from vcpu A. However, prior to this patch, vcpu B was never get
scheduled to run unless there are other interrupts (e.g. timer).
This should not be the case as Vcpu B should have been running regardless
of other interrupts. So, I don't think step 6 and 7 above are correct.
The issue was caused by the apic->irr_pending not set to true when trying to
get vcpu B scheduled. This flag is checked in apic_find_highest_irr() before
searching for the highest bit.
To fix the issue, I decided to leverage the existing emulation code for
ICR and ICR2, which in turn calls apic_send_ipi() to deliver interrupt to vpu B.
However, looking a bit more closely, I notice the logic in svm_deliver_avic_intr()
should also have been changed from kvm_vcpu_wake_up() to kvm_vcpu_kick()
since the latter will result in clearing the IRR bit for the IPI vector
when trying to send IPI as part of the following call path.
vcpu_enter_guest()
|-- inject_pending_event()
|-- kvm_cpu_get_interrupt()
|-- kvm_get_apic_interrupt()
|-- apic_clear_irr()
|-- apic_set_isr()
|-- apic_update_ppr() ....
Please see the patch below.
Not sure if this would address the problem you are seeing.
Thanks,
Suravee
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 24dfa6a93711..d2841c3dbc04 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -5219,11 +5256,13 @@ static void svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
kvm_lapic_set_irr(vec, vcpu->arch.apic);
smp_mb__after_atomic();
- if (avic_vcpu_is_running(vcpu))
+ if (avic_vcpu_is_running(vcpu)) {
wrmsrl(SVM_AVIC_DOORBELL,
kvm_cpu_get_apicid(vcpu->cpu));
- else
- kvm_vcpu_wake_up(vcpu);
+ } else {
+ kvm_make_request(KVM_REQ_EVENT, vcpu);
+ kvm_vcpu_kick(vcpu);
+ }
}
static void svm_ir_list_del(struct vcpu_svm *svm, struct amd_iommu_pi_data *pi)
Powered by blists - more mailing lists