[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <8b57d98d-aa53-40d9-ac5e-3f9b74643b38@redhat.com>
Date: Wed, 4 Jun 2025 19:11:54 +0200
From: Paolo Bonzini <pbonzini@...hat.com>
To: Sean Christopherson <seanjc@...gle.com>, Joerg Roedel <joro@...tes.org>,
David Woodhouse <dwmw2@...radead.org>, Lu Baolu <baolu.lu@...ux.intel.com>
Cc: kvm@...r.kernel.org, iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
Sairaj Kodilkar <sarunkod@....com>, Vasant Hegde <vasant.hegde@....com>,
Maxim Levitsky <mlevitsk@...hat.com>,
Joao Martins <joao.m.martins@...cle.com>,
Francesco Lavra <francescolavra.fl@...il.com>,
David Matlack <dmatlack@...gle.com>
Subject: Re: [PATCH v2 00/59] KVM: iommu: Overhaul device posted IRQs support
On 5/23/25 02:59, Sean Christopherson wrote:
> TL;DR: Overhaul device posted interrupts in KVM and IOMMU, and AVIC in
> general.
>
> This applies on the series to add CONFIG_KVM_IOAPIC (and to kill irq_comm.c):
>
> https://lore.kernel.org/all/20250519232808.2745331-1-seanjc@google.com
>
> Fix a variety of bugs related to device posted IRQs, especially on the
> AMD side, and clean up KVM's implementation (this series actually removes
> more code than it adds).
>
> Stating the obvious, this series is comically large. Though it's smaller than
> v1! (Ignoring that I cheated by moving 15 patches to a prep series, and that
> Paolo already grabbed several patches).
>
> Sairaj, I applied your Tested-by somewhat sparingly, as some of the patches
> changed (most notably "Consolidate IRTE update when toggling AVIC on/off").
> Please holler if you want me to remove/add any tags. And when you get time,
> I'd greatly appreciate a sanity check!
>
> Batch #1 is mostly SVM specific:
>
> - Cleans up various warts and bugs in the IRTE tracking
> - Fixes AVIC to not reject large VMs (honor KVM's ABI)
> - Wire up AVIC to enable_ipiv to support disabling IPI virtualization while
> still utilizing device posted interrupts, and to workaround erratum #1235.
>
> Batch #3 overhauls the guts of IRQ bypass in KVM, and moves the vast majority
> of the logic to common x86; only the code that needs to communicate with the
> IOMMU is truly vendor specific.
>
> Batch #4 is more SVM/AVIC cleanups that are made possible by batch #3.
>
> Batch #5 adds WARNs and drops dead code after all the previous cleanups and
> fixes (I don't want to add the WARNs earlier; I don't see any point in adding
> WARNs in code that's known to be broken).
>
> Batch #6 is yet more SVM/AVIC cleanups, with the specific goal of configuring
> IRTEs to generate GA log interrupts if and only if KVM actually needs a wake
> event.
Looks good - it's not even that different from v1. Thanks!
Paolo
> v2:
> - Drop patches that were already merged.
> - Move code into irq.c, not x86.c. [Paolo]
> - Collect review/testing tags. [Sairaj, Vasant]
> - Sqaush fixup for a comment that was added in the prior patch. [Sairaj]
> - Rewrote the changelog for "Delete IRTE link from previous vCPU irrespective
> of new routing". [Sairaj]
> - Actually drop "struct amd_svm_iommu_ir" and all usage in "Track per-vCPU
> IRTEs using kvm_kernel_irqfd structure" (the previous version was getting
> hilarious lucky with struct offsets). [Sairaj]
> - Drop unused params from kvm_pi_update_irte() and pi_update_irte(). [Sairaj]
> - Document the rules and behavior of amd_iommu_update_ga(). [Joerg]
> - Fix a changelog typo. [Paolo]
> - Document that GALogIntr isn't cached, i.e. can be safely updated without
> an invalidation. [Joao, Vasant]
> - Rework avic_vcpu_{load,put}() to use an enumerated parameter instead of a
> series of booleans. [Paolo]
> - Drop a redundant "&& new". [Francesco]
> - Drop the *** DO NOT MERGE *** testing hack patches.
>
> v1: https://lore.kernel.org/all/20250404193923.1413163-1-seanjc@google.com
>
> Maxim Levitsky (2):
> KVM: SVM: Add enable_ipiv param, never set IsRunning if disabled
> KVM: SVM: Disable (x2)AVIC IPI virtualization if CPU has erratum #1235
>
> Sean Christopherson (57):
> KVM: x86: Pass new routing entries and irqfd when updating IRTEs
> KVM: SVM: Track per-vCPU IRTEs using kvm_kernel_irqfd structure
> KVM: SVM: Delete IRTE link from previous vCPU before setting new IRTE
> iommu/amd: KVM: SVM: Delete now-unused cached/previous GA tag fields
> KVM: SVM: Delete IRTE link from previous vCPU irrespective of new
> routing
> KVM: SVM: Drop pointless masking of default APIC base when setting
> V_APIC_BAR
> KVM: SVM: Drop pointless masking of kernel page pa's with AVIC HPA
> masks
> KVM: SVM: Add helper to deduplicate code for getting AVIC backing page
> KVM: SVM: Drop vcpu_svm's pointless avic_backing_page field
> KVM: SVM: Inhibit AVIC if ID is too big instead of rejecting vCPU
> creation
> KVM: SVM: Drop redundant check in AVIC code on ID during vCPU creation
> KVM: SVM: Track AVIC tables as natively sized pointers, not "struct
> pages"
> KVM: SVM: Drop superfluous "cache" of AVIC Physical ID entry pointer
> KVM: VMX: Move enable_ipiv knob to common x86
> KVM: VMX: Suppress PI notifications whenever the vCPU is put
> KVM: SVM: Add a comment to explain why avic_vcpu_blocking() ignores
> IRQ blocking
> iommu/amd: KVM: SVM: Use pi_desc_addr to derive ga_root_ptr
> iommu/amd: KVM: SVM: Pass NULL @vcpu_info to indicate "not guest mode"
> KVM: SVM: Stop walking list of routing table entries when updating
> IRTE
> KVM: VMX: Stop walking list of routing table entries when updating
> IRTE
> KVM: SVM: Extract SVM specific code out of get_pi_vcpu_info()
> KVM: x86: Move IRQ routing/delivery APIs from x86.c => irq.c
> KVM: x86: Nullify irqfd->producer after updating IRTEs
> KVM: x86: Dedup AVIC vs. PI code for identifying target vCPU
> KVM: x86: Move posted interrupt tracepoint to common code
> KVM: SVM: Clean up return handling in avic_pi_update_irte()
> iommu: KVM: Split "struct vcpu_data" into separate AMD vs. Intel
> structs
> KVM: Don't WARN if updating IRQ bypass route fails
> KVM: Fold kvm_arch_irqfd_route_changed() into
> kvm_arch_update_irqfd_routing()
> KVM: x86: Track irq_bypass_vcpu in common x86 code
> KVM: x86: Skip IOMMU IRTE updates if there's no old or new vCPU being
> targeted
> KVM: x86: Don't update IRTE entries when old and new routes were !MSI
> KVM: SVM: Revert IRTE to legacy mode if IOMMU doesn't provide IR
> metadata
> KVM: SVM: Take and hold ir_list_lock across IRTE updates in IOMMU
> iommu/amd: Document which IRTE fields amd_iommu_update_ga() can modify
> iommu/amd: KVM: SVM: Infer IsRun from validity of pCPU destination
> iommu/amd: Factor out helper for manipulating IRTE GA/CPU info
> iommu/amd: KVM: SVM: Set pCPU info in IRTE when setting vCPU affinity
> iommu/amd: KVM: SVM: Add IRTE metadata to affined vCPU's list if AVIC
> is inhibited
> KVM: SVM: Don't check for assigned device(s) when updating affinity
> KVM: SVM: Don't check for assigned device(s) when activating AVIC
> KVM: SVM: WARN if (de)activating guest mode in IOMMU fails
> KVM: SVM: Process all IRTEs on affinity change even if one update
> fails
> KVM: SVM: WARN if updating IRTE GA fields in IOMMU fails
> KVM: x86: Drop superfluous "has assigned device" check in
> kvm_pi_update_irte()
> KVM: x86: WARN if IRQ bypass isn't supported in kvm_pi_update_irte()
> KVM: x86: WARN if IRQ bypass routing is updated without in-kernel
> local APIC
> KVM: SVM: WARN if ir_list is non-empty at vCPU free
> KVM: x86: Decouple device assignment from IRQ bypass
> KVM: VMX: WARN if VT-d Posted IRQs aren't possible when starting IRQ
> bypass
> KVM: SVM: Use vcpu_idx, not vcpu_id, for GA log tag/metadata
> iommu/amd: WARN if KVM calls GA IRTE helpers without virtual APIC
> support
> KVM: SVM: Fold avic_set_pi_irte_mode() into its sole caller
> KVM: SVM: Don't check vCPU's blocking status when toggling AVIC on/off
> KVM: SVM: Consolidate IRTE update when toggling AVIC on/off
> iommu/amd: KVM: SVM: Allow KVM to control need for GA log interrupts
> KVM: SVM: Generate GA log IRQs only if the associated vCPUs is
> blocking
>
> arch/x86/include/asm/irq_remapping.h | 17 +-
> arch/x86/include/asm/kvm-x86-ops.h | 2 +-
> arch/x86/include/asm/kvm_host.h | 20 +-
> arch/x86/include/asm/svm.h | 13 +-
> arch/x86/kvm/irq.c | 140 ++++++
> arch/x86/kvm/svm/avic.c | 702 ++++++++++++---------------
> arch/x86/kvm/svm/svm.c | 4 +
> arch/x86/kvm/svm/svm.h | 32 +-
> arch/x86/kvm/trace.h | 19 +-
> arch/x86/kvm/vmx/capabilities.h | 1 -
> arch/x86/kvm/vmx/main.c | 2 +-
> arch/x86/kvm/vmx/posted_intr.c | 140 ++----
> arch/x86/kvm/vmx/posted_intr.h | 10 +-
> arch/x86/kvm/vmx/vmx.c | 2 -
> arch/x86/kvm/x86.c | 90 +---
> drivers/iommu/amd/amd_iommu_types.h | 1 -
> drivers/iommu/amd/iommu.c | 125 +++--
> drivers/iommu/intel/irq_remapping.c | 10 +-
> include/linux/amd-iommu.h | 25 +-
> include/linux/kvm_host.h | 9 +-
> include/linux/kvm_irqfd.h | 4 +
> virt/kvm/eventfd.c | 22 +-
> 22 files changed, 672 insertions(+), 718 deletions(-)
>
>
> base-commit: 3debd5461fba1dcb33e732b16153da0cf5d0c251
Powered by blists - more mailing lists