linux-kernel - Re: [PATCH 0/4] Verify devices transition from D3cold to D0

Message-ID: <20240618131452.GC1532424@black.fi.intel.com>
Date: Tue, 18 Jun 2024 16:14:52 +0300
From: Mika Westerberg <mika.westerberg@...ux.intel.com>
To: Mario Limonciello <mario.limonciello@....com>
Cc: Bjorn Helgaas <bhelgaas@...gle.com>,
	Mathias Nyman <mathias.nyman@...el.com>,
	Greg Kroah-Hartman <gregkh@...uxfoundation.org>,
	"open list:PCI SUBSYSTEM" <linux-pci@...r.kernel.org>,
	open list <linux-kernel@...r.kernel.org>,
	"open list:USB XHCI DRIVER" <linux-usb@...r.kernel.org>,
	Daniel Drake <drake@...lessos.org>, Gary Li <Gary.Li@....com>
Subject: Re: [PATCH 0/4] Verify devices transition from D3cold to D0

Hi Mario,

On Thu, Jun 13, 2024 at 12:42:00AM -0500, Mario Limonciello wrote:
> Gary has reported that when a dock is plugged into a system at the same
> time the autosuspend delay has tripped, the USB4 stack malfunctions.
> 
> Messages show up like this:
> 
> ```
> thunderbolt 0000:e5:00.6: ring_interrupt_active: interrupt for TX ring 0 is already enabled
> ```
> 
> Furthermore the USB4 router is non-functional at this point.

Once the USB4 domain starts the sleep transition, it cannot be
interrupted by anything, so it should always go through the full sleep
transition and only then come back from sleep.
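
The ordering described above can be illustrated with a small state
machine. This is only a sketch under stated assumptions, not the
thunderbolt driver's code: the state names, struct domain, and every
function below are hypothetical and exist purely to show a wake event
being latched during the sleep transition and acted on only once the
domain has fully reached sleep.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical states; the real connection manager is organized differently. */
enum dom_state { DOM_ACTIVE, DOM_SUSPENDING, DOM_SLEEPING, DOM_RESUMING };

struct domain {
	enum dom_state state;
	bool wake_pending;      /* wake seen while the sleep transition was running */
	int sleeps_completed;   /* how many times full sleep was actually reached */
};

/* Begin the sleep transition; only valid from the active state. */
void domain_start_sleep(struct domain *d)
{
	assert(d != NULL);
	if (d->state == DOM_ACTIVE)
		d->state = DOM_SUSPENDING;
}

/* A hotplug wake: latched while suspending, acted on only when asleep. */
void domain_wake_event(struct domain *d)
{
	assert(d != NULL);
	if (d->state == DOM_SUSPENDING)
		d->wake_pending = true;       /* must not interrupt the transition */
	else if (d->state == DOM_SLEEPING)
		d->state = DOM_RESUMING;
}

/* The sleep transition finished; service any wake that was latched. */
void domain_sleep_done(struct domain *d)
{
	assert(d != NULL);
	if (d->state != DOM_SUSPENDING)
		return;
	d->state = DOM_SLEEPING;
	d->sleeps_completed++;
	if (d->wake_pending) {
		d->wake_pending = false;
		d->state = DOM_RESUMING;
	}
}

/* The resume finished; the domain is usable again. */
void domain_resume_done(struct domain *d)
{
	assert(d != NULL);
	if (d->state == DOM_RESUMING)
		d->state = DOM_ACTIVE;
}
```

With this shape, a dock plugged in mid-transition never short-circuits
the suspend: the domain still reaches full sleep once and only then
starts resuming.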

> Those messages happen because the device is still in D3cold at the time
> that the PCI core handed control back to the USB4 connection manager
> (thunderbolt).

This is weird. Yes, we should be getting the wake from the hotplug, but
that should happen only after the domain is fully in sleep (D3cold). The
BIOS ACPI code is supposed to deal with this.

> The issue is that it takes time for a device to enter D3cold and do a
> conventional reset, and then more time for it to exit D3cold.
> 
> This appears not to be a new problem; previously there were very similar
> reports from Ryzen XHCI controllers.  Quirks were added for those.
> Furthermore, with extra logging added, it's apparent that other PCI
> devices in the system can take more than 10ms to recover from D3cold as well.

They can take anything up to 100ms after the link has trained.
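
To make the timing concrete, here is a minimal userspace sketch of
polling a device for readiness within a 100ms budget instead of assuming
it is back after a fixed short delay. Everything in it is an assumption
for illustration: struct sim_dev, sim_read_vendor_id(), wait_for_ready(),
the 10ms poll interval, and the 0x1234 ID are made up, and this is not
the PCI core's implementation. The one real behaviour it leans on is
that config reads from a device that is not responding return all ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed budget, taken from the "up to 100ms" figure above. */
#define READY_TIMEOUT_MS 100u
/* Assumed poll interval, chosen only for the example. */
#define POLL_INTERVAL_MS 10u

/* Hypothetical simulated device that starts answering after a delay. */
struct sim_dev {
	uint32_t ready_after_ms;  /* time at which config reads start working */
	uint32_t now_ms;          /* simulated clock */
};

/* Simulated config read: all ones until the device responds. */
uint16_t sim_read_vendor_id(const struct sim_dev *dev)
{
	assert(dev != NULL);
	return dev->now_ms >= dev->ready_after_ms ? 0x1234u : 0xffffu;
}

/*
 * Poll until the device answers or the budget is used up.
 * Returns the time waited in ms on success, or -1 on timeout.
 */
int wait_for_ready(struct sim_dev *dev)
{
	uint32_t waited = 0;

	assert(dev != NULL);
	for (;;) {
		if (sim_read_vendor_id(dev) != 0xffffu)
			return (int)waited;
		if (waited >= READY_TIMEOUT_MS)
			return -1;
		dev->now_ms += POLL_INTERVAL_MS;  /* stands in for a real sleep */
		waited += POLL_INTERVAL_MS;
	}
}
```

A device that needs 35ms is reported ready at the first poll after that
point, while one that needs more than 100ms is reported as a timeout
rather than being handed to its driver while still unresponsive.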