[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <3ea89ca4-c6b2-469f-81fd-00bea2e9cac0@linux.intel.com>
Date: Fri, 29 Dec 2023 17:19:03 +0800
From: Ethan Zhao <haifeng.zhao@...ux.intel.com>
To: "Tian, Kevin" <kevin.tian@...el.com>,
"bhelgaas@...gle.com" <bhelgaas@...gle.com>,
"baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
"dwmw2@...radead.org" <dwmw2@...radead.org>,
"will@...nel.org" <will@...nel.org>,
"robin.murphy@....com" <robin.murphy@....com>,
"lukas@...ner.de" <lukas@...ner.de>
Cc: "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
"iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
"linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH v9 2/5] iommu/vt-d: break out ATS Invalidation if
target device is gone
On 12/29/2023 4:06 PM, Tian, Kevin wrote:
>> From: Ethan Zhao <haifeng.zhao@...ux.intel.com>
>> Sent: Thursday, December 28, 2023 9:03 PM
>>
>> On 12/28/2023 4:30 PM, Tian, Kevin wrote:
>>>> From: Ethan Zhao <haifeng.zhao@...ux.intel.com>
>>>> Sent: Thursday, December 28, 2023 8:17 AM
>>>>
>>>> For those endpoint devices connect to system via hotplug capable ports,
>>>> users could request a warm reset to the device by flapping device's link
>>>> through setting the slot's link control register, as pciehp_ist() DLLSC
>>>> interrupt sequence response, pciehp will unload the device driver and
>>>> then power it off. thus cause an IOMMU device-TLB invalidation (Intel
>>>> VT-d spec, or ATS Invalidation in PCIe spec r6.1) request for device to
>>>> be sent and a long time completion/timeout waiting in interrupt context.
>>> is above describing the behavior of safe removal or surprise removal?
>> bring the link down is a kind of surprise removal for hotplug capable
>>
>> device.
> then it's better to make it clear from beginning that this is about surprise
> removal in which device is removed and cannot respond to on-going
> ATS invalidation request incurred in the removal process.
>
> safe removal should be immune from this problem as the device is still
> responsive in the whole removal process.
>
>>>> [ 4223.822628] Call Trace:
>>>> [ 4223.822628] qi_flush_dev_iotlb+0xb1/0xd0
>>>> [ 4223.822628] __dmar_remove_one_dev_info+0x224/0x250
>>>> [ 4223.822629] dmar_remove_one_dev_info+0x3e/0x50
>>>> [ 4223.822629] intel_iommu_release_device+0x1f/0x30
>>>> [ 4223.822629] iommu_release_device+0x33/0x60
>>>> [ 4223.822629] iommu_bus_notifier+0x7f/0x90
>>>> [ 4223.822630] blocking_notifier_call_chain+0x60/0x90
>>>> [ 4223.822630] device_del+0x2e5/0x420
>>>> [ 4223.822630] pci_remove_bus_device+0x70/0x110
>>>> [ 4223.822630] pciehp_unconfigure_device+0x7c/0x130
> I'm curious why this doesn't occur earlier when the device is
> detached from the driver. At that point presumably the device
> should be detached from the DMA domain which involves
> ATS invalidation too.
>
>>>> while (qi->desc_status[wait_index] != QI_DONE) {
>>>> + /*
>>>> + * if the device-TLB invalidation target device is gone, don't
>>>> + * wait anymore, it might take up to 1min+50%, causes
>>>> system
>>>> + * hang. (see Implementation Note in PCIe spec r6.1 sec
>>>> 10.3.1)
>>>> + */
>>>> + if ((type == QI_DIOTLB_TYPE || type == QI_DEIOTLB_TYPE)
>>>> && pdev)
>>>> + if (!pci_device_is_present(pdev))
>>>> + break;
>>> I'm not sure it's the right thing to do. Such check should be put in the
>>> caller which has the device pointer and can already know it's absent
>>> to not call those cache invalidation helpers.
>> Here is to handle such case, the invalidation request is sent, but the
>>
>> device is just pulled out at that moment.
>>
> one problem - the caller could pass multiple descriptors while type
> only refers to the 1st descriptor.
If the other invalidation request mixed together with ATS invalidation
in the descriptors passed to qi_submit_sync(), would be problem,
so far to Intel VT-d driver, I didn't see such kind of usage, perhaps
will see it later, no one could prevent that.
>
> btw is it an Intel specific problem? A quick glance at smmu driver
> suggests the same problem too:
>
> arm_smmu_atc_inv_domain()
> arm_smmu_cmdq_batch_submit()
> arm_smmu_cmdq_issue_cmdlist()
> arm_smmu_cmdq_poll_until_sync()
> __arm_smmu_cmdq_poll_until_consumed()
>
> /*
> * Wait until the SMMU cons index passes llq->prod.
> * Must be called with the cmdq lock held in some capacity.
> */
> static int __arm_smmu_cmdq_poll_until_consumed(struct arm_smmu_device *smmu,
> struct arm_smmu_ll_queue *llq)
>
> is there a more general way to solve it?
Surprise removal could happen anytime, depends on user,
no preparation could be done, so called 'surprise' :(
Thanks,
Ethan
Powered by blists - more mailing lists