lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aAKelotoUX3qCINt@google.com>
Date: Fri, 18 Apr 2025 11:48:54 -0700
From: Sean Christopherson <seanjc@...gle.com>
To: Vasant Hegde <vasant.hegde@....com>
Cc: Joao Martins <joao.m.martins@...cle.com>, kvm@...r.kernel.org, iommu@...ts.linux.dev, 
	linux-kernel@...r.kernel.org, Maxim Levitsky <mlevitsk@...hat.com>, 
	David Matlack <dmatlack@...gle.com>, Alejandro Jimenez <alejandro.j.jimenez@...cle.com>, 
	Suravee Suthikulpanit <suravee.suthikulpanit@....com>, Joerg Roedel <joro@...tes.org>, 
	David Woodhouse <dwmw2@...radead.org>, Lu Baolu <baolu.lu@...ux.intel.com>, 
	Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH 64/67] iommu/amd: KVM: SVM: Allow KVM to control need for
 GA log interrupts

On Fri, Apr 18, 2025, Vasant Hegde wrote:
> On 4/9/2025 5:26 PM, Joao Martins wrote:
> > On 04/04/2025 20:39, Sean Christopherson wrote:
> >> Add plumbing to the AMD IOMMU driver to allow KVM to control whether or
> >> not an IRTE is configured to generate GA log interrupts.  KVM only needs a
> >> notification if the target vCPU is blocking, so the vCPU can be awakened.
> >> If a vCPU is preempted or exits to userspace, KVM clears is_run, but will
> >> set the vCPU back to running when userspace does KVM_RUN and/or the vCPU
> >> task is scheduled back in, i.e. KVM doesn't need a notification.
> >>
> >> Unconditionally pass "true" in all KVM paths to isolate the IOMMU changes
> >> from the KVM changes insofar as possible.
> >>
> >> Opportunistically swap the ordering of parameters for amd_iommu_update_ga()
> >> so that the match amd_iommu_activate_guest_mode().
> > 
> > Unfortunately I think this patch and the next one might be riding on the
> > assumption that amd_iommu_update_ga() is always cheap :( -- see below.
> > 
> > I didn't spot anything else flawed in the series though, just this one. I would
> > suggest holding off on this and the next one, while progressing with the rest of
> > the series.
> > 
> >> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
> >> index 2e016b98fa1b..27b03e718980 100644
> >> --- a/drivers/iommu/amd/iommu.c
> >> +++ b/drivers/iommu/amd/iommu.c
> >> -static void __amd_iommu_update_ga(struct irte_ga *entry, int cpu)
> >> +static void __amd_iommu_update_ga(struct irte_ga *entry, int cpu,
> >> +				  bool ga_log_intr)
> >>  {
> >>  	if (cpu >= 0) {
> >>  		entry->lo.fields_vapic.destination =
> >> @@ -3783,12 +3784,14 @@ static void __amd_iommu_update_ga(struct irte_ga *entry, int cpu)
> >>  		entry->hi.fields.destination =
> >>  					APICID_TO_IRTE_DEST_HI(cpu);
> >>  		entry->lo.fields_vapic.is_run = true;
> >> +		entry->lo.fields_vapic.ga_log_intr = false;
> >>  	} else {
> >>  		entry->lo.fields_vapic.is_run = false;
> >> +		entry->lo.fields_vapic.ga_log_intr = ga_log_intr;
> >>  	}
> >>  }
> >>
> > 
> > isRun, Destination and GATag are not cached. Quoting the update from a few years
> > back (page 93 of IOMMU spec dated Feb 2025):
> > 
> > | When virtual interrupts are enabled by setting MMIO Offset 0018h[GAEn] and
> > | IRTE[GuestMode=1], IRTE[IsRun], IRTE[Destination], and if present IRTE[GATag],
> > | are not cached by the IOMMU. Modifications to these fields do not require an
> > | invalidation of the Interrupt Remapping Table.
> > 
> > This is the reason we were able to get rid of the IOMMU invalidation in
> > amd_iommu_update_ga() ... which sped up vmexit/vmenter flow with iommu avic.
> > Besides the lock contention that was observed at the time, we were seeing stalls
> > in this path with enough vCPUs IIRC; CCing Alejandro to keep me honest.
> > 
> > Now this change above is incorrect as is and to make it correct: you will need
> > xor with the previous content of the IRTE::ga_log_intr and then if it changes
> > then you re-add back an invalidation command via
> > iommu_flush_irt_and_complete()). The latter is what I am worried will
> > reintroduce these above problem :(
> > 
> > The invalidation command (which has a completion barrier to serialize
> > invalidation execution) takes some time in h/w, and will make all your vcpus
> > content on the irq table lock (as is). Even assuming you somehow move the
> > invalidation outside the lock, you will content on the iommu lock (for the>
> command queue) or best case assuming no locks (which I am not sure it is
> > possible) you will need to wait for the command to complete until you can
> > progress forward with entering/exiting.
> > 
> > Unless the GALogIntr bit is somehow also not cached too which wouldn't need the
> > invalidation command (which would be good news!). Adding Suravee/Vasant here.
> 
> I have checked with HW architects. Its not cached. So we don't need invalidation
> after updating GALogIntr field in IRTE.

Woot!  Better to be lucky than good :-)

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ