Message-ID: <ZvvW1ua2UjwHIOEN@wunner.de>
Date: Tue, 1 Oct 2024 13:02:46 +0200
From: Lukas Wunner <lukas@...ner.de>
To: AceLan Kao <acelan.kao@...onical.com>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>,
Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>,
linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] PCI: pciehp: Fix system hang on resume after hot-unplug
during suspend
On Mon, Sep 30, 2024 at 09:31:53AM +0800, AceLan Kao wrote:
> Lukas Wunner <lukas@...ner.de> wrote on 2024-09-28 at 8:51:
> > - if (pci_get_dsn(pdev) != ctrl->dsn)
> > + dsn = pci_get_dsn(pdev);
> > + if (!PCI_POSSIBLE_ERROR(dsn) &&
> > + dsn != ctrl->dsn)
> > return true;
>
> In my case, the pciehp_device_replaced() returns true from this final check.
> And these are the values I got
> dsn = 0x00000000, ctrl->dsn = 0x7800AA00
> dsn = 0x00000000, ctrl->dsn = 0x21B7D000
Ah, that's because pci_get_dsn() returns 0 if the device is gone.
Below is a modified patch which returns false in that case.
I've only changed:
- dsn = pci_get_dsn(pdev);
- if (!PCI_POSSIBLE_ERROR(dsn) &&
+ if ((dsn = pci_get_dsn(pdev)) &&
+ !PCI_POSSIBLE_ERROR(dsn) &&
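
To illustrate the intent of that change, here is a minimal userspace sketch of the final DSN check only, not the kernel code itself. The helper name dsn_says_replaced() and the macro DSN_POSSIBLE_ERROR() are made up for this example; the macro stands in for what PCI_POSSIBLE_ERROR() does on a 64-bit value (compare against all-ones). A DSN of 0 or all-ones is treated as "cannot tell", so only a valid DSN that differs from the cached one counts as a replacement.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace stand-in (assumption) for the kernel's PCI_POSSIBLE_ERROR()
 * applied to a u64: an all-ones value is what a config space read from
 * an inaccessible device typically yields.
 */
#define DSN_POSSIBLE_ERROR(val) ((val) == UINT64_MAX)

/*
 * Hypothetical helper mirroring only the final DSN comparison.
 * dsn:        value freshly read on resume (0 if the device is gone)
 * cached_dsn: value remembered from before suspend (ctrl->dsn)
 */
bool dsn_says_replaced(uint64_t dsn, uint64_t cached_dsn)
{
	return dsn &&                      /* 0: device gone or no DSN */
	       !DSN_POSSIBLE_ERROR(dsn) && /* all-ones: read failed    */
	       dsn != cached_dsn;          /* valid but different DSN  */
}

/* Exercises the helper with the values reported in this thread. */
void dsn_check_examples(void)
{
	assert(!dsn_says_replaced(0x00000000, 0x7800AA00));
	assert(!dsn_says_replaced(0x00000000, 0x21B7D000));
	assert(dsn_says_replaced(0x21B7D000, 0x7800AA00));
}
```

With the two value pairs you reported (dsn = 0), the helper returns false, so a removed device is no longer mistaken for a replaced one by this particular check.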
> I did another test:
> TBT HDD -> TBT dock -> laptop
> suspend
> TBT HDD -> laptop(replace TBT dock with the TBT HDD)
> resume
> Got the same result as above, looks like it didn't detect the TBT dock
> has been replaced by TBT HDD.
>
> In the original call trace, unplugging the TBT dock or replacing it with
> the TBT HDD makes it return true from the check below:
> if (pci_read_config_dword(pdev, PCI_VENDOR_ID, &reg) ||
> reg != (pdev->vendor | (pdev->device << 16)) ||
> pci_read_config_dword(pdev, PCI_CLASS_REVISION, &reg) ||
> reg != (pdev->revision | (pdev->class << 8)))
> return true;
Hm, that's odd. Why is that? Is reg == 0xffffffff in one of those cases?
I guess that could happen if the Thunderbolt tunnels are not yet
established at that point (i.e. in the ->resume_noirq phase),
but normally they should be. Does this system use ICM-controlled
tunnel management or kernel-native (software-controlled) tunnel
management?
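
To spell out why I'm asking about 0xffffffff, here is a small userspace sketch of the two comparisons in the quoted check. The helper names and the example IDs in the comments are made up for illustration and the config space reads themselves are left out. If the device is unreachable and the read comes back as all-ones, both comparisons mismatch the cached identity and the check returns true, regardless of what is actually plugged in.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: does the dword read at PCI_VENDOR_ID differ from
 * the cached identity? The device ID sits in the upper 16 bits and the
 * vendor ID in the lower 16 bits.
 */
bool id_mismatch(uint32_t reg, uint16_t vendor, uint16_t device)
{
	return reg != ((uint32_t)vendor | ((uint32_t)device << 16));
}

/*
 * Hypothetical helper: does the dword read at PCI_CLASS_REVISION differ
 * from the cached values? The 24-bit class code sits in the upper three
 * bytes and the revision ID in the lowest byte.
 */
bool class_rev_mismatch(uint32_t reg, uint8_t revision, uint32_t class_code)
{
	return reg != ((uint32_t)revision | (class_code << 8));
}

/*
 * Example (made-up IDs): vendor 0x8086, device 0x15ef.
 *   id_mismatch(0x15ef8086, 0x8086, 0x15ef) is false (same device)
 *   id_mismatch(0xffffffff, 0x8086, 0x15ef) is true  (all-ones read)
 */
```

So an all-ones read in the ->resume_noirq phase would look exactly like a different device to this check.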
Thanks,
Lukas