linux-kernel - Re: [PATCH v5 2/5] PCI: Check PCI_PM_CTRL instead of PCI_COMMAND in pci_dev


Message-ID: <20240905093325.GJ1532424@black.fi.intel.com>
Date: Thu, 5 Sep 2024 12:33:25 +0300
From: Mika Westerberg <mika.westerberg@...ux.intel.com>
To: Mario Limonciello <mario.limonciello@....com>
Cc: Bjorn Helgaas <helgaas@...nel.org>, Gary Li <Gary.Li@....com>,
	Mario Limonciello <superm1@...nel.org>,
	Bjorn Helgaas <bhelgaas@...gle.com>,
	Mathias Nyman <mathias.nyman@...el.com>,
	"open list : PCI SUBSYSTEM" <linux-pci@...r.kernel.org>,
	open list <linux-kernel@...r.kernel.org>,
	"open list : USB XHCI DRIVER" <linux-usb@...r.kernel.org>,
	Daniel Drake <drake@...lessos.org>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	Ilpo Järvinen <ilpo.jarvinen@...ux.intel.com>
Subject: Re: [PATCH v5 2/5] PCI: Check PCI_PM_CTRL instead of PCI_COMMAND in
 pci_dev_wait()

Hi,

On Wed, Sep 04, 2024 at 10:24:26AM -0500, Mario Limonciello wrote:
> On 9/4/2024 07:05, Mika Westerberg wrote:
> > Hi,
> > 
> > On Tue, Sep 03, 2024 at 01:32:30PM -0500, Mario Limonciello wrote:
> > > On 9/3/2024 13:25, Bjorn Helgaas wrote:
> > > > On Tue, Sep 03, 2024 at 12:31:00PM -0500, Mario Limonciello wrote:
> > > > > On 9/3/2024 12:11, Bjorn Helgaas wrote:
> > > > > ...
> > > > 
> > > > > >      8) The USB4 stack sees the device and assumes it is in D0, but it
> > > > > >      seems to still be in D3cold.  What is this based on?  Is there a
> > > > > >      config read that returns ~0 data when it shouldn't?
> > > > > 
> > > > > Yes there is.  From earlier in the thread I have a [log] I shared.
> > > > > 
> > > > > The message emitted is from ring_interrupt_active():
> > > > > 
> > > > > "thunderbolt 0000:e5:00.5: interrupt for TX ring 0 is already enabled"
> > > > 
> > > > Right, that's in the cover letter, but I can't tell from this what the
> > > > ioread32(ring->nhi->iobase + reg) returned.  It looks like this is an
> > > > MMIO read of BAR 0, not a config read.
> > > > 
> > > 
> > > Yeah.  I suppose another way to approach this problem is to make something
> > > else in the call chain poll PCI_PM_CTRL.
> > > 
> > > Polling at the start of nhi_runtime_resume() should also work.  For the
> > > "normal" scenario it would just be a single read to PCI_PM_CTRL.
> > > 
> > > Mika, thoughts?
> 
> We did this experiment to throw code to poll PCI_PM_CTRL at the start of
> nhi_runtime_resume() but this also fails.  From that I would hypothesize the
> device transitioned to D0uninitialized sometime in the middle of
> pci_pm_runtime_resume() before the call to pm->runtime_resume(dev);
> 
> > 
> > I'm starting to wonder if we are looking at the correct place ;-) This
> > reminds me that our PCIe SV people recently reported a couple of Linux
> > related issues which they recommended to fix, and these are on my list
> > but I'll share them because maybe they are related?
> 
> Thanks for sharing those.  We had a try with them but sorry to say no
> improvements to the issue at hand.

Okay, thanks for checking.

A few additional side paths here, though. This is supposed to work so
that once the host router sleep bit is set, the driver allows the domain
to enter sleep (i.e. it should not be woken up before it has fully
transitioned). This is what we do:

1. All tunneled PCIe Root/Downstream ports are in D3.
2. All tunneled USB 3.x ports are in U3.
3. No DisplayPort is tunneled.
4. Thunderbolt driver enables wakes.
5. Thunderbolt driver writes sleep ready bit of the host router.
6. Thunderbolt driver runtime suspend is complete.
7. An ACPI method (_PS3 or _PR3.OFF) is called, which triggers the
   "Sleep Event".

If a device is connected between steps 5 and 7, it should not "abort"
the sequence. Unfortunately this is not explicit in the USB4 spec, but
the connection manager guide has a similar note. Even if the connect
happens there, the "Sleep Event" should still happen; after that the
connect can trigger a normal wakeup, which then brings everything back.
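
To make the window concrete, here is a minimal sketch of the check I
have in mind when reading a trace. The struct, the field names and the
timestamp unit are all hypothetical (this is not a tbtools or kernel
data structure); it only encodes "after step 5 and before step 7":

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical timestamps taken from a trace (any monotonic unit).
 * These names are illustrative only and do not exist in the kernel.
 */
struct sleep_trace {
	uint64_t sleep_ready_written;	/* step 5: sleep ready bit written */
	uint64_t sleep_event;		/* step 7: ACPI method triggers "Sleep Event" */
};

/*
 * Returns true when a hotplug notification falls strictly between
 * step 5 and step 7, i.e. in the window where a connect must not
 * abort the sequence.
 */
static bool hotplug_in_sleep_window(const struct sleep_trace *t,
				    uint64_t hotplug)
{
	return hotplug > t->sleep_ready_written && hotplug < t->sleep_event;
}
```

A hotplug for which this returns true is the unexpected case worth
looking at more closely.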

Would it be possible to enable tracing around these steps so that we
can see whether there is an unexpected hotplug notification somewhere
in there? Here are instructions on how to get a pretty accurate trace:

https://github.com/intel/tbtools?tab=readme-ov-file#tracing

Please also capture the full dmesg.

It is entirely possible that this has nothing to do with the issue but I
think it is worth checking.

The second thing we could try is to check the wake status bits after
this has happened, like:

  # tbdump -r 0 -a <ADAPTER> -vv -N 1 PORT_CS_18

(where <ADAPTER> is the lane 0 adapter of the USB4 port the device was
connected to).
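
For reading the dumped value, a minimal decoding sketch follows. The
bit positions are an assumption based on my reading of the PORT_CS_18
definitions in drivers/thunderbolt/tb_regs.h, so please verify them
against your tree and the USB4 spec; the struct and function names are
hypothetical:

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed wake status bits of PORT_CS_18. Verify against
 * drivers/thunderbolt/tb_regs.h before relying on them.
 */
#define PORT_CS_18_WOCS		(1u << 16)	/* wake on connect status */
#define PORT_CS_18_WODS		(1u << 17)	/* wake on disconnect status */
#define PORT_CS_18_WOU4S	(1u << 18)	/* wake on USB4 wake status */

/* Hypothetical container for the decoded bits. */
struct wake_status {
	bool connect;
	bool disconnect;
	bool usb4;
};

/* Split a raw PORT_CS_18 value into the three wake status flags. */
static struct wake_status decode_port_cs_18(uint32_t val)
{
	struct wake_status ws = {
		.connect = (val & PORT_CS_18_WOCS) != 0,
		.disconnect = (val & PORT_CS_18_WODS) != 0,
		.usb4 = (val & PORT_CS_18_WOU4S) != 0,
	};

	return ws;
}
```

If the connect status is the only one set, that points at the hotplug
as the wake source.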

The third thing to try is to comment out TB_WAKE_ON_CONNECT in
tb_switch_suspend(). This should result in no wake even if a device is
connected. That would tell us whether it is really the connect on the
USB4 port that triggered the wake.
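
As a rough model of that experiment (not the actual body of
tb_switch_suspend(), and with flag values that are only illustrative;
check drivers/thunderbolt/tb.h for the real ones), the idea is simply
to build the runtime wake mask without the connect bit:

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative values only; the real ones live in drivers/thunderbolt/tb.h. */
#define TB_WAKE_ON_CONNECT	(1u << 0)
#define TB_WAKE_ON_DISCONNECT	(1u << 1)
#define TB_WAKE_ON_USB4		(1u << 2)

/*
 * Hypothetical helper that builds a runtime suspend wake mask.
 * Passing false models the experiment of commenting out
 * TB_WAKE_ON_CONNECT: every other wake source stays enabled.
 */
static uint32_t runtime_wake_flags(bool allow_wake_on_connect)
{
	uint32_t flags = TB_WAKE_ON_DISCONNECT | TB_WAKE_ON_USB4;

	if (allow_wake_on_connect)
		flags |= TB_WAKE_ON_CONNECT;

	return flags;
}
```

Only the connect wake is dropped, so a wake that still happens must
come from one of the other sources.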

These could (also) explain why the host router appears to be in D3 even
if it should be in D0 already.