Message-ID: <20251117230414.GA2537490@bhelgaas>
Date: Mon, 17 Nov 2025 17:04:14 -0600
From: Bjorn Helgaas <helgaas@...nel.org>
To: Nicolin Chen <nicolinc@...dia.com>
Cc: joro@...tes.org, rafael@...nel.org, bhelgaas@...gle.com,
	alex@...zbot.org, jgg@...dia.com, kevin.tian@...el.com,
	will@...nel.org, robin.murphy@....com, lenb@...nel.org,
	baolu.lu@...ux.intel.com, linux-arm-kernel@...ts.infradead.org,
	iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
	linux-acpi@...r.kernel.org, linux-pci@...r.kernel.org,
	kvm@...r.kernel.org, patches@...ts.linux.dev,
	pjaroszynski@...dia.com, vsethi@...dia.com, etzhao1900@...il.com
Subject: Re: [PATCH v5 4/5] iommu: Introduce iommu_dev_reset_prepare() and
 iommu_dev_reset_done()

On Mon, Nov 10, 2025 at 09:12:54PM -0800, Nicolin Chen wrote:
> PCIe permits a device to ignore ATS invalidation TLPs, while processing a
> reset. This creates a problem visible to the OS where an ATS invalidation
> command will time out. E.g. an SVA domain will have no coordination with a
> reset event and can racily issue ATS invalidations to a resetting device.

s/TLPs, while/TLPs while/

> The OS should do something to mitigate this as we do not want production
> systems to be reporting critical ATS failures, especially in a hypervisor
> environment. Broadly, OS could arrange to ignore the timeouts, block page
> table mutations to prevent invalidations, or disable and block ATS.
> 
> The PCIe spec in sec 10.3.1 IMPLEMENTATION NOTE recommends to disable and
> block ATS before initiating a Function Level Reset. It also mentions that
> other reset methods could have the same vulnerability as well.
> 
> Provide a callback from the PCI subsystem that will enclose the reset and
> have the iommu core temporarily change all the attached domain to BLOCKED.
> After attaching a BLOCKED domain, IOMMU hardware would fence any incoming
> ATS queries. And IOMMU drivers should also synchronously stop issuing new
> ATS invalidations and wait for all ATS invalidations to complete. This can
> avoid any ATS invalidation timeouts.
> 
> However, if there is a domain attachment/replacement happening during an
> ongoing reset, ATS routines may be re-activated between the two function
> calls. So, introduce a new resetting_domain in the iommu_group structure
> to reject any concurrent attach_dev/set_dev_pasid call during a reset for
> a concern of compatibility failure. Since this changes the behavior of an
> attach operation, update the uAPI accordingly.
> 
> Note that there are two corner cases:
>  1. Devices in the same iommu_group
>     Since an attachment is always per iommu_group, disallowing one device
>     to switch domains (or HWPTs in iommufd) would have to disallow others
>     in the same iommu_group to switch domains as well. So, play safe by
>     preventing a shared iommu_group from going through the iommu reset.
>  2. SRIOV devices that its PF is resetting while its VF isn't

Slightly awkward.  Maybe:

  2. An SR-IOV PF that is being reset while its VF is not

(Obviously resetting a PF destroys all the VFs, which I guess is what
you're hinting at below.)

>     In such case, the VF itself is already broken. So, there is no point
>     in preventing PF from going through the iommu reset.

> + * iommu_dev_reset_prepare() - Block IOMMU to prepare for a device reset
> + * @dev: device that is going to enter a reset routine
> + *
> + * When certain device is entering a reset routine, it wants to block any IOMMU
> + * activity during the reset routine. This includes blocking any translation as
> + * well as cache invalidation (especially the device cache).
> + *
> + * This function attaches all RID/PASID of the device's to IOMMU_DOMAIN_BLOCKED
> + * allowing any blocked-domain-supporting IOMMU driver to pause translation and
> + * cahce invalidation, but leaves the software domain pointers intact so later

s/cahce/cache/
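
For readers following the thread, here is a rough userspace sketch of the ordering the commit message describes: park the hardware on a blocked domain before the reset, reject any attach that races with the reset, and restore the untouched software domain pointer afterwards. It is a toy model, not the patch's code. Every model_* name is hypothetical, the plain "resetting" flag only stands in for the resetting_domain field mentioned above (whose exact semantics are not visible in this excerpt), and the -EBUSY return value is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the kernel's real structures. */
struct model_domain {
	const char *name;
};

struct model_group {
	struct model_domain *domain;	/* software domain pointer, kept intact */
	struct model_domain *hw_domain;	/* what the "hardware" translates with */
	int resetting;			/* non-zero while a reset is in flight */
};

static struct model_domain model_blocked = { "blocked" };

/* Before the reset: move the hardware to the blocked domain only. */
static int model_reset_prepare(struct model_group *group)
{
	if (group->resetting)
		return -EBUSY;	/* assumption: a nested prepare is rejected */
	group->resetting = 1;
	group->hw_domain = &model_blocked;
	return 0;
}

/* An attach racing with the reset is rejected instead of re-enabling ATS. */
static int model_attach(struct model_group *group,
			struct model_domain *new_domain)
{
	if (group->resetting)
		return -EBUSY;	/* assumption: errno chosen for illustration */
	group->domain = new_domain;
	group->hw_domain = new_domain;
	return 0;
}

/* After the reset: re-attach whatever the software pointer still names. */
static void model_reset_done(struct model_group *group)
{
	assert(group->resetting);
	group->hw_domain = group->domain;
	group->resetting = 0;
}
```

A caller modelling the PCI side would call model_reset_prepare(), perform the reset, then call model_reset_done(). Any model_attach() in between fails, and the same call succeeds once the reset has completed.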
