Message-ID: <eb4685e6-04fc-4d21-bd98-2a297c183966@linux.intel.com>
Date: Fri, 9 Jan 2026 16:42:06 -0800
From: "Katiyar, Pooja" <pooja.katiyar@...ux.intel.com>
To: Mika Westerberg <mika.westerberg@...ux.intel.com>,
 Mario Limonciello <superm1@...nel.org>
Cc: "open list:THUNDERBOLT DRIVER" <linux-usb@...r.kernel.org>,
 linux-kernel@...r.kernel.org, Andreas Noever <andreas.noever@...il.com>,
 Yehezkel Bernat <YehezkelShB@...il.com>,
 Pooja Katiyar <pooja.katiyar@...el.com>,
 Rene Sapiens <rene.sapiens@...ux.intel.com>
Subject: Re: [PATCH v2 0/2] thunderbolt: Fix S4 resume incongruities

Hi,

On Thu, Jan 8, 2026 at 11:23:18PM -0800, Mika Westerberg wrote:
> On Thu, Jan 08, 2026 at 01:18:58PM -0600, Mario Limonciello wrote:
>> On 1/8/26 5:42 AM, Mika Westerberg wrote:
>>
>> Let me just share the whole log so you can see the full context.
>>
>> https://gist.github.com/superm1/6798fff44d0875b4ed0fe43d0794f81e
> 
> Thanks!
> 
> [Side note, you seem to have the link trained at Gen2 (20G) instead of Gen3
> (40G).]
> 
> Looking at the dmesg I recalled that there is an internal report about
> similar issue by Pooja and Rene (Cc'd) and it all boils down to this log
> entry:
> 
> [  489.339148] thunderbolt 0000:c6:00.6: 2:13: could not allocate DP tunnel
> 
> They made a hack patch that works it around, see below. I wonder if you
> could try that too? If that's the issue (not releasing HopIDs) then we need
> to figure a way to fix it properly. One suggestion is to release DP
> resources earlier, and of course doing full reset as done here. I would
> prefer "smallest" possible change.
> 
> @Pooja, any updates on your side to this?

The "could not allocate DP tunnel" entry in the log looks similar to the
kref synchronization issue we are seeing during S4 resume. The problem
we have identified is that during S4 entry the hibernation image is
created first, and only then are the DP tunnels freed. This means the
hibernation image still contains the tunnels in their active state.
When resuming from S4, the tunnels are restored from the hibernation
image (as active) and then the resume flow activates them again, which
causes a kref count mismatch. This leads to HopID allocation conflicts
and the "could not allocate DP tunnel" error on the next connect/tunnel
activation.

The hack patch works around this by preventing the double activation via
the dprx_started flag. You could try it to confirm whether it's the same
issue we're dealing with.
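
To make the sequence above concrete, here is a minimal userspace sketch of
the reference accounting. It is only a model under stated assumptions, not
driver code: struct model_tunnel and the model_*() helpers are hypothetical
names, a plain int stands in for the kref, and the start/stop helpers only
approximate what tb_dp_dprx_start() and tb_dp_dprx_stop() do.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a DP tunnel; not the kernel's struct tb_tunnel. */
struct model_tunnel {
	int refs;		/* models the kref count */
	bool dprx_started;	/* models the flag tested by the hack patch */
};

/* Rough model of tb_dp_dprx_start(): one reference per unguarded call. */
static int model_dprx_start(struct model_tunnel *t, bool guard)
{
	if (guard && t->dprx_started)
		return 0;	/* already started, do not take another reference */

	t->refs++;
	t->dprx_started = true;
	return 0;
}

/* Rough model of tb_dp_dprx_stop(): drops only the one reference it tracks. */
static void model_dprx_stop(struct model_tunnel *t)
{
	if (t->dprx_started) {
		t->dprx_started = false;
		t->refs--;
	}
}

/*
 * Walks one S4 cycle and returns the references left after deactivation.
 * A nonzero result models a tunnel (and its HopIDs) that is never released.
 */
static int model_s4_cycle(bool guard)
{
	struct model_tunnel live = { .refs = 0, .dprx_started = false };
	struct model_tunnel image;

	model_dprx_start(&live, guard);	/* tunnel active before S4 entry */
	image = live;			/* image saved with the tunnel still active */
	live = image;			/* S4 resume restores the active state */
	model_dprx_start(&live, guard);	/* resume flow activates it again */
	model_dprx_stop(&live);		/* later deactivation */

	return live.refs;
}
```

In this model, model_s4_cycle(false) leaves one reference behind, while
model_s4_cycle(true) balances out to zero, which is the effect the
dprx_started check is meant to have.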

For a proper fix, we are working on a patch that releases the DP resources
before the hibernation image is saved and creates them again during resume,
so that the resources are managed properly. The patch is currently under
review and testing, and we will send it shortly.


> 
> diff --git a/drivers/thunderbolt/tunnel.c b/drivers/thunderbolt/tunnel.c
> index 28c1e5c062f3..45f7ee940f10 100644
> --- a/drivers/thunderbolt/tunnel.c
> +++ b/drivers/thunderbolt/tunnel.c
> @@ -1084,6 +1084,9 @@ static void tb_dp_dprx_work(struct work_struct *work)
>  
>  static int tb_dp_dprx_start(struct tb_tunnel *tunnel)
>  {
> +	if (tunnel->dprx_started)
> +		return 0;
> +
>  	/*
>  	 * Bump up the reference to keep the tunnel around. It will be
>  	 * dropped in tb_dp_dprx_stop() once the tunnel is deactivated.

Thanks,
Pooja
