Message-ID: <BN9PR11MB5276D40AC7105D19F4E54FA18CD5A@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Fri, 21 Nov 2025 07:59:08 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: Nicolin Chen <nicolinc@...dia.com>, "robin.murphy@....com"
<robin.murphy@....com>, "joro@...tes.org" <joro@...tes.org>,
"afael@...nel.org" <afael@...nel.org>, "bhelgaas@...gle.com"
<bhelgaas@...gle.com>, "alex@...zbot.org" <alex@...zbot.org>,
"jgg@...dia.com" <jgg@...dia.com>
CC: "will@...nel.org" <will@...nel.org>, "lenb@...nel.org" <lenb@...nel.org>,
"baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
"linux-arm-kernel@...ts.infradead.org"
<linux-arm-kernel@...ts.infradead.org>, "iommu@...ts.linux.dev"
<iommu@...ts.linux.dev>, "linux-kernel@...r.kernel.org"
<linux-kernel@...r.kernel.org>, "linux-acpi@...r.kernel.org"
<linux-acpi@...r.kernel.org>, "linux-pci@...r.kernel.org"
<linux-pci@...r.kernel.org>, "kvm@...r.kernel.org" <kvm@...r.kernel.org>,
"patches@...ts.linux.dev" <patches@...ts.linux.dev>, "Jaroszynski, Piotr"
<pjaroszynski@...dia.com>, "Sethi, Vikram" <vsethi@...dia.com>,
"helgaas@...nel.org" <helgaas@...nel.org>, "etzhao1900@...il.com"
<etzhao1900@...il.com>
Subject: RE: [PATCH v6 4/5] iommu: Introduce pci_dev_reset_iommu_prepare/done()
> From: Nicolin Chen <nicolinc@...dia.com>
> Sent: Wednesday, November 19, 2025 8:52 AM
>
> PCIe permits a device to ignore ATS invalidation TLPs while processing a
> reset. This creates a problem visible to the OS where an ATS invalidation
> command will time out. E.g. an SVA domain will have no coordination with a
> reset event and can racily issue ATS invalidations to a resetting device.
>
> The OS should do something to mitigate this as we do not want production
> systems to be reporting critical ATS failures, especially in a hypervisor
> environment. Broadly, the OS could arrange to ignore the timeouts, block page
> table mutations to prevent invalidations, or disable and block ATS.
>
> The PCIe r6.0, sec 10.3.1 IMPLEMENTATION NOTE recommends that SW disable
> and block ATS before initiating a Function Level Reset. It also mentions
> that other reset methods could have the same vulnerability.
>
> Provide a callback from the PCI subsystem that will enclose the reset and
> have the iommu core temporarily change all the attached RID/PASID domains
> to group->blocking_domain so that the IOMMU hardware fences any incoming
> ATS queries. IOMMU drivers should also synchronously stop issuing new ATS
> invalidations and wait for all ATS invalidations to complete. This avoids
> any ATS invalidation timeouts.
>
> However, if a domain attachment/replacement happens during an ongoing
> reset, ATS routines may be re-activated between the two function calls.
> So, introduce a new resetting_domain in the iommu_group structure to
> reject any concurrent attach_dev/set_dev_pasid call during a reset, out
> of concern for compatibility failures. Since this changes the behavior of
> an attach operation, update the uAPI accordingly.
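
To make the prepare/done bracketing and the resetting_domain gate described
above easier to follow, here is a toy userspace model in C. It is only a
sketch and not the patch's code: the toy_* types and functions are
hypothetical stand-ins, there is no locking, and the -EBUSY return value is
an assumption chosen for illustration.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's real definitions. */
struct toy_domain {
	const char *name;
};

struct toy_group {
	struct toy_domain *domain;		/* currently attached domain */
	struct toy_domain *blocking_domain;	/* fences incoming ATS queries */
	struct toy_domain *resetting_domain;	/* non-NULL while a reset is ongoing */
};

/* Models the "prepare" half: park the group on the blocking domain. */
static int toy_reset_prepare(struct toy_group *g)
{
	if (g->resetting_domain)
		return -EBUSY;	/* assumption: a nested reset is refused */

	g->resetting_domain = g->domain;	/* remember what to restore */
	g->domain = g->blocking_domain;
	return 0;
}

/* Models attach_dev/set_dev_pasid: rejected while a reset is ongoing. */
static int toy_attach(struct toy_group *g, struct toy_domain *new_domain)
{
	if (g->resetting_domain)
		return -EBUSY;	/* assumption: error code is illustrative */

	g->domain = new_domain;
	return 0;
}

/* Models the "done" half: restore the saved domain and lift the gate. */
static void toy_reset_done(struct toy_group *g)
{
	if (!g->resetting_domain)
		return;		/* no reset in progress, nothing to undo */

	g->domain = g->resetting_domain;
	g->resetting_domain = NULL;
}
```

In this model a reset path would call toy_reset_prepare(), perform the
reset, then call toy_reset_done(); any toy_attach() issued in between fails
instead of re-activating ATS on the resetting device.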
>
> Note that there are two corner cases:
> 1. Devices in the same iommu_group
> Since an attachment is always per iommu_group, this means that any
> sibling devices in the iommu_group cannot change domain, to prevent
> race conditions.
> 2. An SR-IOV PF that is being reset while its VF is not
> In such a case, the VF itself is already broken. So, there is no point
> in preventing PF from going through the iommu reset.
>
> Reviewed-by: Lu Baolu <baolu.lu@...ux.intel.com>
> Signed-off-by: Nicolin Chen <nicolinc@...dia.com>
Reviewed-by: Kevin Tian <kevin.tian@...el.com>