Message-ID: <BN9PR11MB52763E38B4C8B59C9A9AD9E18CA8A@BN9PR11MB5276.namprd11.prod.outlook.com>
Date: Thu, 18 Dec 2025 08:04:20 +0000
From: "Tian, Kevin" <kevin.tian@...el.com>
To: "Guo, Jinhui" <guojinhui.liam@...edance.com>, "dwmw2@...radead.org"
	<dwmw2@...radead.org>, "baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
	"joro@...tes.org" <joro@...tes.org>, "will@...nel.org" <will@...nel.org>
CC: "Guo, Jinhui" <guojinhui.liam@...edance.com>, "iommu@...ts.linux.dev"
	<iommu@...ts.linux.dev>, "linux-kernel@...r.kernel.org"
	<linux-kernel@...r.kernel.org>, "stable@...r.kernel.org"
	<stable@...r.kernel.org>
Subject: RE: [PATCH v2 2/2] iommu/vt-d: Flush dev-IOTLB only when PCIe device
 is accessible in scalable mode

> From: Jinhui Guo <guojinhui.liam@...edance.com>
> Sent: Thursday, December 11, 2025 12:00 PM
> 
> Commit 4fc82cd907ac ("iommu/vt-d: Don't issue ATS Invalidation
> request when device is disconnected") relies on
> pci_dev_is_disconnected() to skip ATS invalidation for
> safely-removed devices, but it does not cover link-down caused
> by faults, which can still hard-lock the system.

According to the commit msg it actually tries to fix the hard lockup
with surprise removal. For safe removal the device is not removed
before invalidation is done:

"
    For safe removal, device wouldn't be removed until the whole software
    handling process is done, it wouldn't trigger the hard lock up issue
    caused by too long ATS Invalidation timeout wait.
"

Can you help articulate the problem, especially the part about
'link-down caused by faults'? What are those faults? How do they
differ from the surprise removal described in the commit msg, such
that pci_dev_is_disconnected() is not true for them?

> 
> For example, if a VM fails to connect to the PCIe device,

'failed' for what reason?

> "virsh destroy" is executed to release resources and isolate
> the fault, but a hard-lockup occurs while releasing the group fd.
> 
> Call Trace:
>  qi_submit_sync
>  qi_flush_dev_iotlb
>  intel_pasid_tear_down_entry
>  device_block_translation
>  blocking_domain_attach_dev
>  __iommu_attach_device
>  __iommu_device_set_domain
>  __iommu_group_set_domain_internal
>  iommu_detach_group
>  vfio_iommu_type1_detach_group
>  vfio_group_detach_container
>  vfio_group_fops_release
>  __fput
> 
> Although pci_device_is_present() is slower than
> pci_dev_is_disconnected(), it still takes only ~70 µs on a
> ConnectX-5 (8 GT/s, x2) and becomes even faster as PCIe speed
> and width increase.
> 
> Besides, devtlb_invalidation_with_pasid() is called only in the
> paths below, which are far less frequent than memory map/unmap.
> 
> 1. mm-struct release
> 2. {attach,release}_dev
> 3. set/remove PASID
> 4. dirty-tracking setup
> 

Surprise removal can happen at any time, e.g. right after the
pci_device_is_present() check. In the end we need the logic in
qi_check_fault() to check for presence when an ITE timeout error is
received, to break the infinite loop. So in your case, even with
that logic in place, you still observe the lockup (probably because
the hardware ITE timeout is longer than the lockup detection on
the CPU)?

In any case this change cannot 100% fix the lockup. It only
reduces the likelihood, which should be made clear.
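
To make the race above concrete, here is a minimal userspace model, not
kernel code. Every name in it (mock_dev, flush_dev_tlb, MAX_POLLS, the
result enum) is hypothetical and only for illustration. It assumes a
device that answers after a fixed number of polls unless it vanishes
first, and it models a vanished device as a timeout on every poll. The
up-front presence check alone does not help when the device disappears
after the check; only re-checking presence on timeout ends the wait.

```c
#include <stdbool.h>

#define MAX_POLLS 1000  /* bound for the demo; stands in for "spins forever" */

/* Hypothetical model of a device, not a kernel structure. */
struct mock_dev {
    bool present;        /* what a presence check would report */
    int  vanish_at_poll; /* poll at which the link drops, -1 = never */
    int  answer_at_poll; /* poll at which a present device completes */
};

enum flush_result {
    FLUSH_SKIPPED,  /* device already gone at the up-front check */
    FLUSH_DONE,     /* invalidation completed normally */
    FLUSH_ABORTED,  /* timeout seen, presence re-checked, wait abandoned */
    FLUSH_STUCK     /* never completed: models the hard lockup */
};

enum flush_result flush_dev_tlb(struct mock_dev *dev, bool recheck_on_timeout)
{
    /* Up-front check: skip the flush if the device is already gone. */
    if (!dev->present)
        return FLUSH_SKIPPED;

    for (int poll = 0; poll < MAX_POLLS; poll++) {
        /* Surprise removal may occur after the up-front check. */
        if (poll == dev->vanish_at_poll)
            dev->present = false;

        if (dev->present && poll >= dev->answer_at_poll)
            return FLUSH_DONE;

        /* In this model an absent device times out on every poll. */
        bool timed_out = !dev->present;

        /* Re-check presence on timeout to break out of the wait. */
        if (timed_out && recheck_on_timeout && !dev->present)
            return FLUSH_ABORTED;
    }
    return FLUSH_STUCK;
}
```

With a device that vanishes at poll 1 and would have answered at poll 3,
the model returns FLUSH_STUCK without the re-check and FLUSH_ABORTED
with it, regardless of the up-front check having passed.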
