linux-kernel - Re: [RFC PATCH v9 2/5] iommu/vt-d: break out ATS Invalidation if target device is gone

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <3ea89ca4-c6b2-469f-81fd-00bea2e9cac0@linux.intel.com>
Date: Fri, 29 Dec 2023 17:19:03 +0800
From: Ethan Zhao <haifeng.zhao@...ux.intel.com>
To: "Tian, Kevin" <kevin.tian@...el.com>,
 "bhelgaas@...gle.com" <bhelgaas@...gle.com>,
 "baolu.lu@...ux.intel.com" <baolu.lu@...ux.intel.com>,
 "dwmw2@...radead.org" <dwmw2@...radead.org>,
 "will@...nel.org" <will@...nel.org>,
 "robin.murphy@....com" <robin.murphy@....com>,
 "lukas@...ner.de" <lukas@...ner.de>
Cc: "linux-pci@...r.kernel.org" <linux-pci@...r.kernel.org>,
 "iommu@...ts.linux.dev" <iommu@...ts.linux.dev>,
 "linux-kernel@...r.kernel.org" <linux-kernel@...r.kernel.org>
Subject: Re: [RFC PATCH v9 2/5] iommu/vt-d: break out ATS Invalidation if
 target device is gone


On 12/29/2023 4:06 PM, Tian, Kevin wrote:
>> From: Ethan Zhao <haifeng.zhao@...ux.intel.com>
>> Sent: Thursday, December 28, 2023 9:03 PM
>>
>> On 12/28/2023 4:30 PM, Tian, Kevin wrote:
>>>> From: Ethan Zhao <haifeng.zhao@...ux.intel.com>
>>>> Sent: Thursday, December 28, 2023 8:17 AM
>>>>
>>>> For those endpoint devices connect to system via hotplug capable ports,
>>>> users could request a warm reset to the device by flapping device's link
>>>> through setting the slot's link control register, as pciehp_ist() DLLSC
>>>> interrupt sequence response, pciehp will unload the device driver and
>>>> then power it off. thus cause an IOMMU device-TLB invalidation (Intel
>>>> VT-d spec, or ATS Invalidation in PCIe spec r6.1) request for device to
>>>> be sent and a long time completion/timeout waiting in interrupt context.
>>> is above describing the behavior of safe removal or surprise removal?
>> bring the link down is a kind of surprise removal for hotplug capable
>>
>> device.
> then it's better to make it clear from beginning that this is about surprise
> removal in which device is removed and cannot respond to on-going
> ATS invalidation request incurred in the removal process.
>
> safe removal should be immune from this problem as the device is still
> responsive in the whole removal process.
>
>>>> [ 4223.822628] Call Trace:
>>>> [ 4223.822628]  qi_flush_dev_iotlb+0xb1/0xd0
>>>> [ 4223.822628]  __dmar_remove_one_dev_info+0x224/0x250
>>>> [ 4223.822629]  dmar_remove_one_dev_info+0x3e/0x50
>>>> [ 4223.822629]  intel_iommu_release_device+0x1f/0x30
>>>> [ 4223.822629]  iommu_release_device+0x33/0x60
>>>> [ 4223.822629]  iommu_bus_notifier+0x7f/0x90
>>>> [ 4223.822630]  blocking_notifier_call_chain+0x60/0x90
>>>> [ 4223.822630]  device_del+0x2e5/0x420
>>>> [ 4223.822630]  pci_remove_bus_device+0x70/0x110
>>>> [ 4223.822630]  pciehp_unconfigure_device+0x7c/0x130
> I'm curious why this doesn't occur earlier when the device is
> detached from the driver. At that point presumably the device
> should be detached from the DMA domain which involves
> ATS invalidation too.
>
>>>>    	while (qi->desc_status[wait_index] != QI_DONE) {
>>>> +		/*
>>>> +		 * if the device-TLB invalidation target device is gone, don't
>>>> +		 * wait anymore, it might take up to 1min+50%, causes
>>>> system
>>>> +		 * hang. (see Implementation Note in PCIe spec r6.1 sec
>>>> 10.3.1)
>>>> +		 */
>>>> +		if ((type == QI_DIOTLB_TYPE || type == QI_DEIOTLB_TYPE)
>>>> && pdev)
>>>> +			if (!pci_device_is_present(pdev))
>>>> +				break;
>>> I'm not sure it's the right thing to do. Such check should be put in the
>>> caller which has the device pointer and can already know it's absent
>>> to not call those cache invalidation helpers.
>> Here is to handle such case, the invalidation request is sent, but the
>>
>> device is just pulled out at that moment.
>>
> one problem - the caller could pass multiple descriptors while type
> only refers to the 1st descriptor.

If the other invalidation request mixed together with ATS invalidation

in the descriptors passed to qi_submit_sync(), would be problem,

so far to Intel VT-d driver, I didn't see such kind of usage, perhaps

will see it later, no one could prevent that.

>
> btw is it an Intel specific problem? A quick glance at smmu driver
> suggests the same problem too:
>
>    arm_smmu_atc_inv_domain()
>      arm_smmu_cmdq_batch_submit()
>        arm_smmu_cmdq_issue_cmdlist()
>          arm_smmu_cmdq_poll_until_sync()
>            __arm_smmu_cmdq_poll_until_consumed()
>
> /*
>   * Wait until the SMMU cons index passes llq->prod.
>   * Must be called with the cmdq lock held in some capacity.
>   */
> static int __arm_smmu_cmdq_poll_until_consumed(struct arm_smmu_device *smmu,
>                                                 struct arm_smmu_ll_queue *llq)
>
> is there a more general way to solve it?

Surprise removal could happen anytime, depends on user,

no preparation could be done, so called 'surprise' :(


Thanks,

Ethan