linux-kernel - Re: [PATCH v2 0/2] thunderbolt: Fix S4 resume incongruities


Message-ID: <ad8cf89d-a171-4e72-996e-8b09d16f9017@kernel.org>
Date: Thu, 8 Jan 2026 13:18:58 -0600
From: Mario Limonciello <superm1@...nel.org>
To: Mika Westerberg <mika.westerberg@...ux.intel.com>
Cc: "open list:THUNDERBOLT DRIVER" <linux-usb@...r.kernel.org>,
 linux-kernel@...r.kernel.org, Andreas Noever <andreas.noever@...il.com>,
 Michael Jamet <michael.jamet@...el.com>,
 Yehezkel Bernat <YehezkelShB@...il.com>
Subject: Re: [PATCH v2 0/2] thunderbolt: Fix S4 resume incongruities

On 1/8/26 5:42 AM, Mika Westerberg wrote:
> On Wed, Jan 07, 2026 at 02:50:54PM -0600, Mario Limonciello wrote:
>> On 1/7/26 3:33 AM, Mika Westerberg wrote:
>>> Hi,
>>>
>>> On Mon, Jan 05, 2026 at 11:37:47PM -0600, Mario Limonciello (AMD) wrote:
>>>> When a machine is restored from S4, if the firmware CM has created
>>>> tunnels, the kernel's expectations can differ from those it has when
>>>> booting from S5.  This series addresses those differences.
>>>
>>> I suspect there is no Firmware CM in AMD platforms so this actually means
>>> the BIOS CM, correct?
>>
>> That's correct.
>>
>>>
>>> However, on S4 we actually do reset host router when the "boot kernel" is
>>> started before loading and jumping to the hibernation image.
>>
>> That's only if thunderbolt.ko is built into the kernel or is included in the
>> initramfs before it does the pivot to the hibernation image.
> 
> Ah good point.
> 
>> At least in the tests we were doing it's not part of the boot kernel.
>>
>>> It might be
>>> that this boot kernel tunnel configuration is causing the issues you are
>>> seeing (can you elaborate on those?)
>>
>> The issues manifest "downstream" in the GPU driver.  There are a bunch of
>> AUX failures and a non-functional display.  Tracing it back, the GPU driver
>> isn't alive at the time the tunnels are reconstructed, so the CM tears the
>> DP tunnel down, and when the GPU driver does come up the display is not
>> functional.
>>
>> DP tunnel constructed at:
>>
>> [  486.007194] thunderbolt 0000:c6:00.6: AUX RX path activation complete
>>
>> First DPRx timeout at:
>>
>> [  486.135483] thunderbolt 0000:c6:00.6: 0:6 <-> 2:13 (DP): DPRX read timeout
>>
>> DP tunnel deactivating at:
>>
>> [  486.331856] thunderbolt 0000:c6:00.6: 0:6 <-> 2:13 (DP): deactivating
> 
> Hmm, we have dprx_timeout by default 12 seconds. How come it tears down the
> tunnel already?

*I believe* it's because of a hot unplug event that occurs from it not 
working.

> 
>>
>> First DPRx DPCD reading starts at:
>>
>> [  486.351765] amdgpu 0000:c4:00.0: amdgpu: [drm] DPIA AUX failed on 0xf0000(10), error 7
> 
> This would have made it within the 12s if I read the timestamps right.

Let me just share the whole log so you can see the full context.

https://gist.github.com/superm1/6798fff44d0875b4ed0fe43d0794f81e

Notice that the GPU driver resume hasn't started yet at the time of the
first two instances of the DPRX timeout.  This is when the display is
brought back up:

[  486.328339] amdgpu 0000:c4:00.0: amdgpu: [drm] DMUB hardware initialized: version=0x09001C01
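
To make the timing question above easier to follow, here is a minimal
sketch that puts the quoted log lines on a common timeline relative to the
AUX RX path activation.  The log lines and the 12 second dprx_timeout
default are taken from this thread; the helper names (parse, deltas) are
only illustrative and are not part of any kernel tooling.

```python
import re

# Log lines copied from the excerpts quoted in this thread, in timestamp order.
LOG_LINES = [
    "[  486.007194] thunderbolt 0000:c6:00.6: AUX RX path activation complete",
    "[  486.135483] thunderbolt 0000:c6:00.6: 0:6 <-> 2:13 (DP): DPRX read timeout",
    "[  486.328339] amdgpu 0000:c4:00.0: amdgpu: [drm] DMUB hardware initialized: version=0x09001C01",
    "[  486.331856] thunderbolt 0000:c6:00.6: 0:6 <-> 2:13 (DP): deactivating",
    "[  486.351765] amdgpu 0000:c4:00.0: amdgpu: [drm] DPIA AUX failed on 0xf0000(10), error 7",
]

# Default dprx_timeout as stated in the thread (12 seconds).
DPRX_TIMEOUT_S = 12.0

# Matches the "[  seconds.micros] message" prefix used by dmesg.
TS_RE = re.compile(r"^\s*\[\s*(\d+\.\d+)\]\s*(.*)$")


def parse(line):
    """Split one dmesg line into (timestamp_seconds, message)."""
    m = TS_RE.match(line)
    if m is None:
        raise ValueError(f"no dmesg timestamp in: {line!r}")
    return float(m.group(1)), m.group(2)


def deltas(lines):
    """Return (seconds since the first line, message) for every line."""
    events = [parse(line) for line in lines]
    t0 = events[0][0]
    return [(round(t - t0, 6), msg) for t, msg in events]


if __name__ == "__main__":
    for dt, msg in deltas(LOG_LINES):
        flag = "within" if dt < DPRX_TIMEOUT_S else "past"
        print(f"+{dt:.6f}s ({flag} {DPRX_TIMEOUT_S:.0f}s timeout) {msg}")
```

All of the quoted events land well under half a second after the tunnel
is activated, so the deactivation happens long before a 12 second timeout
could have expired.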