linux-kernel - Re: [PATCH] PCI: pciehp: Fix system hang on resume after hot-unplug during suspend


Message-ID: <CAFv23Q=QJ+SmpwvzLmzJeCXwYrAHVvTK96Wz7rY=df7VmGbSmw@mail.gmail.com>
Date: Mon, 30 Sep 2024 11:27:28 +0800
From: AceLan Kao <acelan.kao@...onical.com>
To: Lukas Wunner <lukas@...ner.de>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>, Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>, 
	linux-pci@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] PCI: pciehp: Fix system hang on resume after hot-unplug
 during suspend

On Fri, Sep 27, 2024 at 5:28 PM, Lukas Wunner <lukas@...ner.de> wrote:
>
> On Fri, Sep 27, 2024 at 03:33:50PM +0800, AceLan Kao wrote:
> > On 2024-9-26 9:23, Lukas Wunner <lukas@...ner.de> wrote:
> > > On Thu, Sep 26, 2024 at 08:59:09PM +0800, Chia-Lin Kao (AceLan) wrote:
> > > > Remove unnecessary pci_walk_bus() call in pciehp_resume_noirq(). This
> > > > fixes a system hang that occurs when resuming after a Thunderbolt dock
> > > > with attached Thunderbolt storage is unplugged during system suspend.
> > > >
> > > > The PCI core already handles setting the disconnected state for devices
> > > > under a port during suspend/resume.
> > > >
> > > > The redundant bus walk was interfering with proper hardware state
> > > > detection during resume, causing a system hang when hot-unplugging
> > > > daisy-chained Thunderbolt devices.
> >
> > I have no good answer for you now.
> > After enabling some debugging options and debugging lock options, I
> > still didn't get any message.
>
> Have you tried "no_console_suspend" on the kernel command line?
>
>
> > ubuntu@...alhost:~$ lspci -tv
> > -[0000:00]-+-00.0  Intel Corporation Device 6400
> >           +-02.0  Intel Corporation Lunar Lake [Intel Graphics]
> >           +-04.0  Intel Corporation Device 641d
> >           +-05.0  Intel Corporation Device 645d
> >           +-07.0-[01-38]--
> >           +-07.2-[39-70]----00.0-[3a-70]--+-00.0-[3b]--
> >           |                               +-01.0-[3c-4d]--
> >           |                               +-02.0-[4e-5f]----00.0-[4f-50]----01.0-[50]----00.0  Phison Electronics Corporation E12 NVMe Controller
> >           |                               +-03.0-[60-6f]--
> >           |                               \-04.0-[70]--
> >
> > This is the Dell WD22TB dock:
> > 39:00.0 PCI bridge [0604]: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] [8086:0b26] (rev 03)
> >        Subsystem: Intel Corporation Thunderbolt 4 Bridge [Goshen Ridge 2020] [8086:0000]
> >
> > This is the TBT storage connected to the dock:
> > 50:00.0 Non-Volatile memory controller [0108]: Phison Electronics Corporation E12 NVMe Controller [1987:5012] (rev 01)
> >        Subsystem: Phison Electronics Corporation E12 NVMe Controller [1987:5012]
> >        Kernel driver in use: nvme
> >        Kernel modules: nvme
>
> The lspci output shows another PCIe switch in-between the WD22TB dock and
> the NVMe drive (bus 4e and 4f).  Is that another Thunderbolt device?
> Or is the NVMe drive built into the WD22TB dock and the switch at bus
> 4e and 4f is a non-Thunderbolt PCIe switch in the dock?
>
> I realize now that commit 9d573d19547b ("PCI: pciehp: Detect device
> replacement during system sleep") is a little overzealous because it
> not only reacts to *replaced* devices but also to *unplugged* devices:
> If the device was unplugged, reading the vendor and device ID returns
> 0xffff, which is different from the cached value, so the device is
> assumed to have been replaced even though it's actually been unplugged.
>
> The device replacement check runs in the ->resume_noirq phase.  Later on
> in the ->resume phase, pciehp_resume() calls pciehp_check_presence() to
> check for unplugged devices.  Commit 9d573d19547b inadvertently reacts
> before pciehp_check_presence() gets a chance to react.  So that's something
> that we should probably change.
>
> I'm not sure though why that would cause a hang.  But there is a known issue
> that a deadlock may occur when hot-removing nested PCIe switches (which is
> what you've got here).  Keith Busch recently re-discovered the issue.
> You may want to check whether the hang goes away if you apply this patch:
>
> https://lore.kernel.org/all/20240612181625.3604512-2-kbusch@meta.com/
>
> If it does go away then at least we know what the root cause is.
Yes, the 2 patches work.

>
> The patch is a bit hackish, but there's an ongoing effort to tackle the
> problem more thoroughly:
>
> https://lore.kernel.org/all/20240722151936.1452299-1-kbusch@meta.com/
> https://lore.kernel.org/all/20240827192826.710031-1-kbusch@meta.com/
v2 can't be applied cleanly, so I made some changes.
But this series doesn't work for me.

>
> Thanks,
>
> Lukas