[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20251108230348.GJ1932966@nvidia.com>
Date: Sat, 8 Nov 2025 19:03:48 -0400
From: Jason Gunthorpe <jgg@...dia.com>
To: Vasant Hegde <vasant.hegde@....com>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@....com>,
nicolinc@...dia.com, linux-kernel@...r.kernel.org,
robin.murphy@....com, will@...nel.org, joro@...tes.org,
kevin.tian@...el.com, jsnitsel@...hat.com, iommu@...ts.linux.dev,
santosh.shukla@....com, sairaj.arunkodilkar@....com,
jon.grimm@....com, prashanthpra@...gle.com, wvw@...gle.com,
wnliu@...gle.com, gptran@...gle.com, kpsingh@...gle.com,
joao.m.martins@...cle.com, alejandro.j.jimenez@...cle.com
Subject: Re: [PATCH v4 15/16] iommu/amd: Refactor logic to program the host
page table in DTE
On Sat, Nov 08, 2025 at 10:56:38PM +0530, Vasant Hegde wrote:
> > On Tue, Oct 21, 2025 at 01:43:23AM +0000, Suravee Suthikulpanit wrote:
> >> @@ -2088,37 +2104,28 @@ static void set_dte_entry(struct amd_iommu *iommu,
> >> else
> >> domid = domain->id;
> >>
> >> /*
> >> * When SNP is enabled, we can only support TV=1 with non-zero domain ID.
> >> * This is prevented by the SNP-enable and IOMMU_DOMAIN_IDENTITY check in
> >> * do_iommu_domain_alloc().
> >> */
> >> WARN_ON(amd_iommu_snp_en && (domid == 0));
> >>
> >> + amd_iommu_make_clear_dte(dev_data, &new);
> >>
> >> + old_domid = READ_ONCE(dte->data[1]) & DTE_DOMID_MASK;
> >
> > This old_domid stuff doesn't make any sense. I think the commit that
> > added it is bonkers: 36b7200f67df ("iommu/amd: Flush old domains in kdump kernel")
> >
> > The problem is that the dom ids that are present in the re-used DTE
> > table have to be reserved from the domain id alloctor at boot.
> > > If the kdump kernel tries to create new DTEs it MUST NOT re-use any
> > IDs that are actively being using in DTEs already or you get data
> > corruption. Randomly flushing IDs is just getting lucky..
>
> Good catch. Thanks!
>
> Looks like commit 38e5f33ee359 ("iommu/amd: Reuse device table for kdump") broke
> domain ID reservation in kdump path :-( We will fix it.
I suggest just reading the whole DTE and permanently reserving any
DomIDs from the ida. Just leak them forever, that is good enough for a
kdump kernel. Then get rid of this stuff above.
> > (though how does IDENTITY on a device with a PASID installed work?)
>
> Probably set_dte_identity() should call set_dte_gcr3_table () and update
> relevant fields.
Yeah, maybe so, though it might be best if the caller can detect it
and have it call both functions instead of trying to bury it.
There is a fair number of variations here as you have
blocked/identity, RID/PASID attach and PASID detach to all consider.
It would be nice if it worked properly, identity on RID while PASID is
in use has a strong real use case.
Jason
Powered by blists - more mailing lists