[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <b29b8c22-2fd4-4b5e-b755-9198874157c7@amd.com>
Date: Fri, 18 Apr 2025 17:47:55 +0530
From: Vasant Hegde <vasant.hegde@....com>
To: Joao Martins <joao.m.martins@...cle.com>,
Sean Christopherson <seanjc@...gle.com>
Cc: kvm@...r.kernel.org, iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
Maxim Levitsky <mlevitsk@...hat.com>, David Matlack <dmatlack@...gle.com>,
Alejandro Jimenez <alejandro.j.jimenez@...cle.com>,
Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
Joerg Roedel <joro@...tes.org>, David Woodhouse <dwmw2@...radead.org>,
Lu Baolu <baolu.lu@...ux.intel.com>, Paolo Bonzini <pbonzini@...hat.com>
Subject: Re: [PATCH 64/67] iommu/amd: KVM: SVM: Allow KVM to control need for
GA log interrupts
Hi Joao, Sean,
On 4/9/2025 5:26 PM, Joao Martins wrote:
> On 04/04/2025 20:39, Sean Christopherson wrote:
>> Add plumbing to the AMD IOMMU driver to allow KVM to control whether or
>> not an IRTE is configured to generate GA log interrupts. KVM only needs a
>> notification if the target vCPU is blocking, so the vCPU can be awakened.
>> If a vCPU is preempted or exits to userspace, KVM clears is_run, but will
>> set the vCPU back to running when userspace does KVM_RUN and/or the vCPU
>> task is scheduled back in, i.e. KVM doesn't need a notification.
>>
>> Unconditionally pass "true" in all KVM paths to isolate the IOMMU changes
>> from the KVM changes insofar as possible.
>>
>> Opportunistically swap the ordering of parameters for amd_iommu_update_ga()
>> so that the match amd_iommu_activate_guest_mode().
>
> Unfortunately I think this patch and the next one might be riding on the
> assumption that amd_iommu_update_ga() is always cheap :( -- see below.
>
> I didn't spot anything else flawed in the series though, just this one. I would
> suggest holding off on this and the next one, while progressing with the rest of
> the series.
>
>> diff --git a/drivers/iommu/amd/iommu.c b/drivers/iommu/amd/iommu.c
>> index 2e016b98fa1b..27b03e718980 100644
>> --- a/drivers/iommu/amd/iommu.c
>> +++ b/drivers/iommu/amd/iommu.c
>> -static void __amd_iommu_update_ga(struct irte_ga *entry, int cpu)
>> +static void __amd_iommu_update_ga(struct irte_ga *entry, int cpu,
>> + bool ga_log_intr)
>> {
>> if (cpu >= 0) {
>> entry->lo.fields_vapic.destination =
>> @@ -3783,12 +3784,14 @@ static void __amd_iommu_update_ga(struct irte_ga *entry, int cpu)
>> entry->hi.fields.destination =
>> APICID_TO_IRTE_DEST_HI(cpu);
>> entry->lo.fields_vapic.is_run = true;
>> + entry->lo.fields_vapic.ga_log_intr = false;
>> } else {
>> entry->lo.fields_vapic.is_run = false;
>> + entry->lo.fields_vapic.ga_log_intr = ga_log_intr;
>> }
>> }
>>
>
> isRun, Destination and GATag are not cached. Quoting the update from a few years
> back (page 93 of IOMMU spec dated Feb 2025):
>
> | When virtual interrupts are enabled by setting MMIO Offset 0018h[GAEn] and
> | IRTE[GuestMode=1], IRTE[IsRun], IRTE[Destination], and if present IRTE[GATag],
> | are not cached by the IOMMU. Modifications to these fields do not require an
> | invalidation of the Interrupt Remapping Table.
>
> This is the reason we were able to get rid of the IOMMU invalidation in
> amd_iommu_update_ga() ... which sped up vmexit/vmenter flow with iommu avic.
> Besides the lock contention that was observed at the time, we were seeing stalls
> in this path with enough vCPUs IIRC; CCing Alejandro to keep me honest.
>
> Now this change above is incorrect as is and to make it correct: you will need
> xor with the previous content of the IRTE::ga_log_intr and then if it changes
> then you re-add back an invalidation command via
> iommu_flush_irt_and_complete()). The latter is what I am worried will
> reintroduce these above problem :(
>
> The invalidation command (which has a completion barrier to serialize
> invalidation execution) takes some time in h/w, and will make all your vcpus
> content on the irq table lock (as is). Even assuming you somehow move the
> invalidation outside the lock, you will content on the iommu lock (for the>
command queue) or best case assuming no locks (which I am not sure it is
> possible) you will need to wait for the command to complete until you can
> progress forward with entering/exiting.
>
> Unless the GALogIntr bit is somehow also not cached too which wouldn't need the
> invalidation command (which would be good news!). Adding Suravee/Vasant here.
I have checked with HW architects. Its not cached. So we don't need invalidation
after updating GALogIntr field in IRTE.
-Vasant
Powered by blists - more mailing lists