[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ad53c9fe-a874-4554-bce5-a5bcfefe627a@amd.com>
Date: Fri, 11 Apr 2025 13:38:39 +0530
From: Sairaj Kodilkar <sarunkod@....com>
To: Sean Christopherson <seanjc@...gle.com>, Paolo Bonzini
<pbonzini@...hat.com>, Joerg Roedel <joro@...tes.org>, David Woodhouse
<dwmw2@...radead.org>, Lu Baolu <baolu.lu@...ux.intel.com>
CC: <kvm@...r.kernel.org>, <iommu@...ts.linux.dev>,
<linux-kernel@...r.kernel.org>, Maxim Levitsky <mlevitsk@...hat.com>, "Joao
Martins" <joao.m.martins@...cle.com>, David Matlack <dmatlack@...gle.com>,
Vasant Hegde <vasant.hegde@....com>, Naveen N Rao <naveen.rao@....com>
Subject: Re: [PATCH 02/67] KVM: x86: Reset IRTE to host control if *new* route
isn't postable
On 4/5/2025 1:08 AM, Sean Christopherson wrote:
> Restore an IRTE back to host control (remapped or posted MSI mode) if the
> *new* GSI route prevents posting the IRQ directly to a vCPU, regardless of
> the GSI routing type. Updating the IRTE if and only if the new GSI is an
> MSI results in KVM leaving an IRTE posting to a vCPU.
>
> The dangling IRTE can result in interrupts being incorrectly delivered to
> the guest, and in the worst case scenario can result in use-after-free,
> e.g. if the VM is torn down, but the underlying host IRQ isn't freed.
>
> Fixes: efc644048ecd ("KVM: x86: Update IRTE for posted-interrupts")
> Fixes: 411b44ba80ab ("svm: Implements update_pi_irte hook to setup posted interrupt")
> Cc: stable@...r.kernel.org
> Signed-off-by: Sean Christopherson <seanjc@...gle.com>
> ---
> arch/x86/kvm/svm/avic.c | 61 ++++++++++++++++++----------------
> arch/x86/kvm/vmx/posted_intr.c | 28 ++++++----------
> 2 files changed, 43 insertions(+), 46 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index a961e6e67050..ef08356fdb1c 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -896,6 +896,7 @@ int avic_pi_update_irte(struct kvm *kvm, unsigned int host_irq,
> {
> struct kvm_kernel_irq_routing_entry *e;
> struct kvm_irq_routing_table *irq_rt;
> + bool enable_remapped_mode = true;
> int idx, ret = 0;
>
> if (!kvm_arch_has_assigned_device(kvm) || !kvm_arch_has_irq_bypass())
> @@ -932,6 +933,8 @@ int avic_pi_update_irte(struct kvm *kvm, unsigned int host_irq,
> kvm_vcpu_apicv_active(&svm->vcpu)) {
> struct amd_iommu_pi_data pi;
>
> + enable_remapped_mode = false;
> +
> /* Try to enable guest_mode in IRTE */
> pi.base = __sme_set(page_to_phys(svm->avic_backing_page) &
> AVIC_HPA_MASK);
> @@ -950,33 +953,6 @@ int avic_pi_update_irte(struct kvm *kvm, unsigned int host_irq,
> */
> if (!ret && pi.is_guest_mode)
> svm_ir_list_add(svm, &pi);
> - } else {
> - /* Use legacy mode in IRTE */
> - struct amd_iommu_pi_data pi;
> -
> - /**
> - * Here, pi is used to:
> - * - Tell IOMMU to use legacy mode for this interrupt.
> - * - Retrieve ga_tag of prior interrupt remapping data.
> - */
> - pi.prev_ga_tag = 0;
> - pi.is_guest_mode = false;
> - ret = irq_set_vcpu_affinity(host_irq, &pi);
> -
> - /**
> - * Check if the posted interrupt was previously
> - * setup with the guest_mode by checking if the ga_tag
> - * was cached. If so, we need to clean up the per-vcpu
> - * ir_list.
> - */
> - if (!ret && pi.prev_ga_tag) {
> - int id = AVIC_GATAG_TO_VCPUID(pi.prev_ga_tag);
> - struct kvm_vcpu *vcpu;
> -
> - vcpu = kvm_get_vcpu_by_id(kvm, id);
> - if (vcpu)
> - svm_ir_list_del(to_svm(vcpu), &pi);
> - }
> }
>
> if (!ret && svm) {
> @@ -991,7 +967,36 @@ int avic_pi_update_irte(struct kvm *kvm, unsigned int host_irq,
> }
> }
>
> - ret = 0;
> + if (enable_remapped_mode) {
> + /* Use legacy mode in IRTE */
> + struct amd_iommu_pi_data pi;
> +
> + /**
> + * Here, pi is used to:
> + * - Tell IOMMU to use legacy mode for this interrupt.
> + * - Retrieve ga_tag of prior interrupt remapping data.
> + */
> + pi.prev_ga_tag = 0;
> + pi.is_guest_mode = false;
> + ret = irq_set_vcpu_affinity(host_irq, &pi);
> +
> + /**
> + * Check if the posted interrupt was previously
> + * setup with the guest_mode by checking if the ga_tag
> + * was cached. If so, we need to clean up the per-vcpu
> + * ir_list.
> + */
> + if (!ret && pi.prev_ga_tag) {
> + int id = AVIC_GATAG_TO_VCPUID(pi.prev_ga_tag);
> + struct kvm_vcpu *vcpu;
> +
> + vcpu = kvm_get_vcpu_by_id(kvm, id);
> + if (vcpu)
> + svm_ir_list_del(to_svm(vcpu), &pi);
> + }
> + } else {
> + ret = 0;
> + }
Hi Sean,
I think you can remove this else and "ret = 0". Because Code will come
to this point when irq_set_vcpu_affinity() is successful, ensuring that
ret is 0.
.../...
Regards
Sairaj Kodilkar
Powered by blists - more mailing lists