lists.openwall.net | lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC | |
Open Source and information security mailing list archives
| ||
|
Message-ID: <7b95bc97-db82-4de4-aee2-6bc7685cee1b@linux.intel.com> Date: Mon, 25 Dec 2023 10:35:50 +0800 From: Ethan Zhao <haifeng.zhao@...ux.intel.com> To: Bjorn Helgaas <helgaas@...nel.org> Cc: bhelgaas@...gle.com, baolu.lu@...ux.intel.com, dwmw2@...radead.org, will@...nel.org, robin.murphy@....com, lukas@...ner.de, linux-pci@...r.kernel.org, iommu@...ts.linux.dev, linux-kernel@...r.kernel.org Subject: Re: [RFC PATCH v6 2/4] iommu/vt-d: don's issue devTLB flush request when device is disconnected On 12/25/2023 10:21 AM, Bjorn Helgaas wrote: > On Mon, Dec 25, 2023 at 09:46:26AM +0800, Ethan Zhao wrote: >> On 12/25/2023 6:43 AM, Bjorn Helgaas wrote: >>> On Sun, Dec 24, 2023 at 12:06:55AM -0500, Ethan Zhao wrote: >>>> ... >>>> [ 4211.433662] pcieport 0000:17:01.0: pciehp: Slot(108): Link Down >>>> [ 4211.433664] pcieport 0000:17:01.0: pciehp: Slot(108): Card not present >>>> [ 4223.822591] NMI watchdog: Watchdog detected hard LOCKUP on cpu 144 >>>> ... >>>> [ 4223.822647] Kernel Offset: 0x6400000 from 0xffffffff81000000 (relocation >>>> range: 0xffffffff80000000-0xffffffffbfffffff) >>> The timestamps don't help understand the problem, so you could remove >>> them so they aren't a distraction. >> Lukas said he see the qi_submit_sync takes up to 12 seconds to trigger the >> watchdog. > OK, so the timestamps told us how long the watchdog tolerates. I > don't know how useful that is. I suspect that's not a fixed interval > (probably differs by watchdog and possibly user preference). > >>>> Fix it by checking the device's error_state in >>>> devtlb_invalidation_with_pasid() to avoid sending meaningless devTLB flush >>>> request to link down device that is set to pci_channel_io_perm_failure and >>>> then powered off in >>> A pci_dev_is_disconnected() is racy in this context, so this by itself >>> doesn't look like a complete "fix". >> A quick workaround. > Call it a "quick workaround" then, not a "fix". I'm personally not > usually interested in quick workarounds that are not actually fixes, > but the IOMMU folks would be the ones to decide. > > Maybe this is more of an optimization? If patch 4/4 is a real fix (in > the sense that if you disable the watchdog, you would get correct > results after a long timeout), maybe you could reorder the patches so > 4/4 comes first, and this one becomes an optimization on top of it? I Make sense, will reorder them. > haven't worked though the whole path, so I don't know exactly how > these patches work. > >>>> pciehp_ist() >>>> pciehp_handle_presence_or_link_change() >>>> pciehp_disable_slot() >>>> remove_board() >>>> pciehp_unconfigure_device() >>> There are some interesting steps missing here between >>> pciehp_unconfigure_device() and devtlb_invalidation_with_pasid(). >>> >>> devtlb_invalidation_with_pasid() is Intel-specific. ATS Invalidate >>> Requests are not Intel-specific, so all IOMMU drivers must have to >>> deal with the case of an ATS Invalidate Request where we never receive >>> a corresponding ATS Invalidate Completion. Do other IOMMUs like AMD >>> and ARM have a similar issue? >> So far fix it in Intel vt-d specific path. > If the other IOMMU drivers are vulnerable, I guess they would like to > fix this at the same time and in a similar way if possible. > >>>> For SAVE_REMOVAL unplug, link is alive when iommu releases devcie and >>>> issues devTLB invalidate request, wouldn't trigger such issue. >>>> >>>> This patch works for all links of SURPPRISE_REMOVAL unplug operations. >>> It's not completely obvious that a fix that works for the safe removal >>> case also works for the surprise removal case. Can you briefly >>> explain why it does? >> As I explained to baolu. >> >> For safe_removal, device wouldn't be removed till the whole software >> handling process done, so without this fix, it wouldn't trigger the lockup >> issue, and in safe_removal path, device state isn't set to >> pci_channel_io_perm_failure in pciehp_unconfigure_device() by checking >> 'presence', patch calling this pci_dev_is_disconnected() will return false >> there, wouldn't break the function. so it works. >> >> For suprise_removal, device state is set to pci_channel_io_perm_failure in >> pciehp_unconfigure_device(), means device already be in power-off/link-down >> /removed state, callpci_dev_is_disconnected() hrere will return true to >> break >> >> the function not to send ATS invalidation request anymore, thus avoid the >> further long time waiting trigger the hard lockup. > s/safe_removal/safe removal/ (they are not a single word) > s/suprise_removal/surprise removal/ (misspelled, also not a single word) > >> Do I make it clear enough ? > Needs to be in the commit log, of course. Okay, append to the commit log. Thanks, Ethan >>>> Tested-by: Haorong Ye <yehaorong@...edance.com> >>>> Signed-off-by: Ethan Zhao <haifeng.zhao@...ux.intel.com> >>>> --- >>>> drivers/iommu/intel/pasid.c | 3 +++ >>>> 1 file changed, 3 insertions(+) >>>> >>>> diff --git a/drivers/iommu/intel/pasid.c b/drivers/iommu/intel/pasid.c >>>> index 74e8e4c17e81..7dbee9931eb6 100644 >>>> --- a/drivers/iommu/intel/pasid.c >>>> +++ b/drivers/iommu/intel/pasid.c >>>> @@ -481,6 +481,9 @@ devtlb_invalidation_with_pasid(struct intel_iommu *iommu, >>>> if (!info || !info->ats_enabled) >>>> return; >>>> + if (pci_dev_is_disconnected(to_pci_dev(dev))) >>>> + return; >>>> + >>>> sid = info->bus << 8 | info->devfn; >>>> qdep = info->ats_qdep; >>>> pfsid = info->pfsid; >>> This goes on to call qi_submit_sync(), which contains a restart: loop. >>> I don't know the programming model there, but it looks possible that >>> qi_submit_sync() and qi_check_fault() might not handle the case of an >>> unreachable device correctly. There should be a way to exit that >>> restart: loop in cases where the device doesn't respond at all. >> Yes, fix it in patch[4/4] to break it out when device is gone.
Powered by blists - more mailing lists