Message-ID: <20230201140915.000024a0@Huawei.com>
Date: Wed, 1 Feb 2023 14:09:15 +0000
From: Jonathan Cameron <Jonathan.Cameron@...wei.com>
To: Jason Gunthorpe <jgg@...dia.com>
CC: Bjorn Helgaas <helgaas@...nel.org>,
Baolu Lu <baolu.lu@...ux.intel.com>,
Bjorn Helgaas <bhelgaas@...gle.com>,
Joerg Roedel <jroedel@...e.de>,
"Matt Fagnani" <matt.fagnani@...l.net>,
Christian König <christian.koenig@....com>,
Kevin Tian <kevin.tian@...el.com>,
Vasant Hegde <vasant.hegde@....com>,
Tony Zhu <tony.zhu@...el.com>, <linux-pci@...r.kernel.org>,
<iommu@...ts.linux.dev>, <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH v3 1/1] PCI: Add translated request only flag for
pci_enable_pasid()
On Tue, 31 Jan 2023 22:36:27 -0400
Jason Gunthorpe <jgg@...dia.com> wrote:
> On Tue, Jan 31, 2023 at 06:14:19PM -0600, Bjorn Helgaas wrote:
>
> > > AMD GPU is one of those devices.
> >
> > I guess you mean the AMD GPU has ATS, PRI, and PASID Capabilities?
> > And furthermore, that the GPU *always* uses Translated addresses with
> > PASID?
>
> I'm not versed in the spec lingo, but the GPU issues MemRd/Wrs with
> the translated bit set and no PASID header - which is the correct form
> for an address that was translated by ATS.
FWIW, there is a capability bit and an enable bit in the PASID
capability/control registers that say whether a device can/should add a
PASID to a translated request. I think the intent is that a host can
sanity check translated requests to make sure the device isn't making
them up. To do that it needs the PASID. Not sure any hosts do this yet
though ;)
Not worth much, but I thought the device always sent the PASID, so I dug
out the spec to check (I was wrong: it is both optional and configurable).
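To make the "optional and configurable" part concrete, here is a rough
user-space sketch (not kernel code) of the check a host could make on the
two 16-bit register values. The global enable at bit 0 matches
PCI_PASID_CTRL_ENABLE in Linux's pci_regs.h. The macro names for the
translated-request bits and their position (bit 3 in both registers) are
my assumptions for illustration only, so check them against the spec
revision you have before relying on them.

```c
#include <stdbool.h>
#include <stdint.h>

/* PASID Control register: global PASID enable (bit 0). */
#define PASID_CTRL_ENABLE      0x0001u

/*
 * ASSUMPTION: "translated requests with PASID" supported/enable bits.
 * Hypothetical names and bit position (bit 3), not existing kernel
 * definitions; verify against the PCIe spec before use.
 */
#define PASID_CAP_XLATED_REQ   0x0008u
#define PASID_CTRL_XLATED_REQ  0x0008u

/*
 * Return true if the device is expected to attach a PASID to its
 * translated requests, given the raw capability and control values.
 */
static bool pasid_on_translated_requests(uint16_t cap, uint16_t ctrl)
{
	if (!(ctrl & PASID_CTRL_ENABLE))
		return false;	/* PASID is disabled altogether */
	if (!(cap & PASID_CAP_XLATED_REQ))
		return false;	/* optional: device does not implement it */
	return (ctrl & PASID_CTRL_XLATED_REQ) != 0;	/* configurable */
}
```

With these assumed bit positions, a device reporting cap 0x0008 with
control 0x0009 would tag its translated requests, while control 0x0001
(PASID enabled, feature left off) would give the untagged form Jason
describes above.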
>
> To get to that it issues ATS requests, and only the ATS related
> requests will carry the PASID.
>
> ATS related requests always route to the root port, which is why it is
> functionally equivalent to ACS RR/UF in these cases.
>
> Translated requests always route where they are supposed to go, even
> with P2P and things.
>
> > And this applies even if there is no ACS or ACS doesn't support
> > PCI_ACS_RR and PCI_ACS_UF.
> >
> > The black screen happens because ... ?
>
> AMD GPU driver bugs blow up if it cannot setup PASID.
>
> > I couldn't figure out the NULL pointer dereference. I expected it to
> > be from a BUG() or similar in report_iommu_fault(), but I don't see
> > that.
>
> IIRC it is a buggy error unwind handling in the AMD GPU driver.
>
> Jason