Message-ID: <94b08bab-6488-4c4a-9742-30a69972ba50@linux.intel.com>
Date: Fri, 22 Dec 2023 09:56:39 +0800
From: Ethan Zhao <haifeng.zhao@...ux.intel.com>
To: Lukas Wunner <lukas@...ner.de>
Cc: bhelgaas@...gle.com, baolu.lu@...ux.intel.com, dwmw2@...radead.org,
will@...nel.org, robin.murphy@....com, linux-pci@...r.kernel.org,
iommu@...ts.linux.dev, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v4 2/2] iommu/vt-d: don't issue devTLB flush request when
device is disconnected
On 12/21/2023 6:39 PM, Lukas Wunner wrote:
> On Tue, Dec 19, 2023 at 07:51:53PM -0500, Ethan Zhao wrote:
>> For endpoint devices connected to the system via hotplug-capable ports,
>> users could request a warm reset of the device by flapping the device's
>> link through the slot's link control register. In response to the DLLSC
>> interrupt sequence, pciehp_ist() unloads the device driver and then powers
>> the device off, which causes an IOMMU devTLB flush request to be sent to
>> the device and a long completion/timeout wait in interrupt context.
> I think the problem is in the "waiting in interrupt context".
>
> Can you change qi_submit_sync() to *sleep* until the queue is done?
> Instead of busy-waiting in atomic context?
If you read that function carefully, you wouldn't say "sleep" there: it is
'sync'ed, and callers may be in atomic context.
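For reference, the completion wait in qi_submit_sync()
(drivers/iommu/intel/dmar.c) is roughly the following; this is a simplified
sketch with descriptor setup and most fault handling elided. The queue lock
is a raw spinlock and callers may be in atomic context, so the loop spins
with cpu_relax() rather than sleeping:

	/* simplified sketch of the wait loop in qi_submit_sync() */
	while (qi->desc_status[wait_index] != QI_DONE) {
		rc = qi_check_fault(iommu, index, wait_index);
		if (rc)
			break;

		raw_spin_unlock(&qi->q_lock);
		cpu_relax();
		raw_spin_lock(&qi->q_lock);
	}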
>
> Is the hardware capable of sending an interrupt once the queue is done?
> If it is not capable, would it be viable to poll with exponential backoff
> and sleep in-between polling once the polling delay increases beyond, say,
> 10 usec?
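Presumably you mean something like the following hypothetical sketch (not
code from the tree; 'delay' and the 10 usec threshold are illustrative).
The queue lock would have to be dropped around the sleep, and the caller
would have to be in process context:

	/* hypothetical poll-with-exponential-backoff sketch */
	unsigned int delay = 1;

	while (qi->desc_status[wait_index] != QI_DONE) {
		if (delay < 10)
			udelay(delay);                  /* short waits: spin */
		else
			usleep_range(delay, 2 * delay); /* longer waits: sleep */
		delay = min(2 * delay, 1000U);
	}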
I don't know whether polling plus sleeping for the completion of a meaningless
devTLB invalidation request, blindly sent to a removed/powered-down/link-down
device, makes sense or not.
But according to PCIe spec r6.1, sec 10.3.1:
"Software ensures no invalidations are issued to a Function when its
ATS capability is disabled."
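The guard this series proposes is essentially the following (a sketch based
on the patch description; it assumes the pci_dev_is_disconnected() helper
made public by patch 1/2): don't build and queue an invalidation for a
device that can never answer.

	/* sketch: bail out before issuing a devTLB invalidation to a
	 * surprise-removed / powered-off device (assumes the
	 * pci_dev_is_disconnected() helper exported by patch 1/2) */
	struct pci_dev *pdev = to_pci_dev(dev);

	if (pci_dev_is_disconnected(pdev))
		return;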
>
> Again, the proposed patch is not a proper solution. It will paper over
> the issue most of the time but every once in a while someone will still
> get a hard lockup splat and it will then be more difficult to reproduce
> and fix if the proposed patch is accepted.
Could you point out why it is not proper? Is there any other window in which
the hard lockup could still happen in the ATS-capable device surprise-removal
case if we check the connection state first?
Please help to elaborate.
>
>
>> [ 4223.822622] CPU: 144 PID: 1422 Comm: irq/57-pciehp Kdump: loaded Tainted: G S
>> OE kernel version xxxx
> I don't see any reason to hide the kernel version.
> This isn't Intel Confidential information.
>
Yes, this is a stack trace from an old kernel, but the customer also tried the
latest 6.7-rc4 (doesn't work) and the patched 6.7-rc4 (fixed).
Thanks,
Ethan
>> [ 4223.822628] Call Trace:
>> [ 4223.822628] qi_flush_dev_iotlb+0xb1/0xd0
>> [ 4223.822628] __dmar_remove_one_dev_info+0x224/0x250
>> [ 4223.822629] dmar_remove_one_dev_info+0x3e/0x50
> __dmar_remove_one_dev_info() was removed by db75c9573b08 in v6.0
> one and a half years ago, so the stack trace appears to be from
> an older kernel version.
>
> Thanks,
>
> Lukas