Message-ID: <8908354f-4962-4ecb-85f1-b1c58ce45385@linux.intel.com>
Date: Tue, 20 Jan 2026 14:49:55 +0800
From: Baolu Lu <baolu.lu@...ux.intel.com>
To: Jinhui Guo <guojinhui.liam@...edance.com>, dwmw2@...radead.org,
joro@...tes.org, will@...nel.org
Cc: iommu@...ts.linux.dev, linux-kernel@...r.kernel.org,
stable@...r.kernel.org
Subject: Re: [PATCH v2 0/2] iommu/vt-d: Skip dev-iotlb flush for inaccessible
PCIe device
On 12/11/25 11:59, Jinhui Guo wrote:
> Hi, all
>
> We hit hard lockups when the Intel IOMMU waits indefinitely for an ATS invalidation
> that cannot complete, especially under GDR high-load conditions.
>
> 1. Hard lockup when a passthrough PCIe NIC with ATS enabled goes link-down while
> the Intel IOMMU is in non-scalable mode. Two scenarios exist: NIC link-down with
> an explicit link-down event, and link-down without any event.
>
> a) NIC link-down with an explicit link-down event.
> Call Trace:
> qi_submit_sync
> qi_flush_dev_iotlb
> __context_flush_dev_iotlb.part.0
> domain_context_clear_one_cb
> pci_for_each_dma_alias
> device_block_translation
> blocking_domain_attach_dev
> iommu_deinit_device
> __iommu_group_remove_device
> iommu_release_device
> iommu_bus_notifier
> blocking_notifier_call_chain
> bus_notify
> device_del
> pci_remove_bus_device
> pci_stop_and_remove_bus_device
> pciehp_unconfigure_device
> pciehp_disable_slot
> pciehp_handle_presence_or_link_change
> pciehp_ist
>
> b) NIC link-down without an event - hard-lock on VM destroy.
> Call Trace:
> qi_submit_sync
> qi_flush_dev_iotlb
> __context_flush_dev_iotlb.part.0
> domain_context_clear_one_cb
> pci_for_each_dma_alias
> device_block_translation
> blocking_domain_attach_dev
> __iommu_attach_device
> __iommu_device_set_domain
> __iommu_group_set_domain_internal
> iommu_detach_group
> vfio_iommu_type1_detach_group
> vfio_group_detach_container
> vfio_group_fops_release
> __fput
>
> 2. Hard lockup when a passthrough PCIe NIC with ATS enabled goes link-down while
> the Intel IOMMU is in scalable mode; a NIC link-down without an event hard-locks
> on VM destroy.
> Call Trace:
> qi_submit_sync
> qi_flush_dev_iotlb
> intel_pasid_tear_down_entry
> device_block_translation
> blocking_domain_attach_dev
> __iommu_attach_device
> __iommu_device_set_domain
> __iommu_group_set_domain_internal
> iommu_detach_group
> vfio_iommu_type1_detach_group
> vfio_group_detach_container
> vfio_group_fops_release
> __fput
>
> Fix both issues with two patches:
> 1. Skip dev-IOTLB flush for inaccessible devices in __context_flush_dev_iotlb() using
> pci_device_is_present().
> 2. Use pci_device_is_present() instead of pci_dev_is_disconnected() to decide when to
> skip ATS invalidation in devtlb_invalidation_with_pasid().
>
> Best Regards,
> Jinhui
>
> ---
> v1: https://lore.kernel.org/all/20251210171431.1589-1-guojinhui.liam@...edance.com/
>
> Changelog v1 -> v2 (suggested by Baolu Lu):
> - Simplify the pci_device_is_present() check in __context_flush_dev_iotlb().
> - Add Cc: stable@...r.kernel.org to both patches.
>
> Jinhui Guo (2):
> iommu/vt-d: Skip dev-iotlb flush for inaccessible PCIe device without
> scalable mode
> iommu/vt-d: Flush dev-IOTLB only when PCIe device is accessible in
> scalable mode
Queued for iommu next.
Thanks,
baolu