Message-ID: <20251209054141.1975982-1-acelan.kao@canonical.com>
Date: Tue,  9 Dec 2025 13:41:41 +0800
From: "Chia-Lin Kao (AceLan)" <acelan.kao@...onical.com>
To: Andreas Noever <andreas.noever@...il.com>,
	Mika Westerberg <westeri@...nel.org>,
	Yehezkel Bernat <YehezkelShB@...il.com>,
	linux-usb@...r.kernel.org,
	linux-kernel@...r.kernel.org
Subject: [PATCH] [RFC] thunderbolt: Add delay for Dell U2725QE link width

When plugging in a Dell U2725QE Thunderbolt monitor, the kernel produces
a call trace during initial enumeration. The device automatically
disconnects and reconnects ~3 seconds later, and works correctly on the
second attempt.

Issue Description:
==================
The Dell U2725QE (USB4 device 8087:b26) requires additional time during
link width negotiation from single lane to dual lane. On first plug, the
following sequence occurs:

1. Port state reaches TB_PORT_UP (link established, single lane)
2. Path activation begins immediately
3. tb_path_activate() -> tb_port_write() returns -ENOTCONN (error -107)
4. Call trace is generated at tb_path_activate()
5. Device disconnects/reconnects automatically after ~3 seconds
6. Second attempt succeeds with full dual-lane bandwidth

First attempt dmesg (failure):
-------------------------------
[   36.030347] thunderbolt 0000:c7:00.6: 2:16: available bandwidth for new USB3 tunnel 9000/9000 Mb/s
[   36.030613] thunderbolt 0000:c7:00.6: 2: USB3 tunnel creation failed
[   36.031530] thunderbolt 0000:c7:00.6: PCIe Down path activation failed
[   36.031531] WARNING: drivers/thunderbolt/path.c:589 at 0x0, CPU#12: pool-/usr/libex/3145

Second attempt dmesg (success):
--------------------------------
[   40.440012] thunderbolt 0000:c7:00.6: 2:16: available bandwidth for new USB3 tunnel 36000/36000 Mb/s
[   40.440261] thunderbolt 0000:c7:00.6: 2:16: maximum required bandwidth for USB3 tunnel 9000 Mb/s
[   40.440269] thunderbolt 0000:c7:00.6: 0:4 <-> 2:16 (USB3): activating
[   40.440271] thunderbolt 0000:c7:00.6: 0:4 <-> 2:16 (USB3): allocating initial bandwidth 9000/9000 Mb/s

The bandwidth difference (9000 vs 36000 Mb/s) indicates the first attempt
occurs while the link is still in single-lane mode.

Root Cause Analysis:
====================
The error originates from the Thunderbolt/USB4 device hardware itself:

1. Port config space read/write returns TB_CFG_ERROR_PORT_NOT_CONNECTED
2. This gets translated to -ENOTCONN in tb_cfg_get_error()
3. The port's control channel is temporarily unavailable during state
   transition from single lane to dual lane (lane bonding)

The comment in drivers/thunderbolt/ctl.c explains this is expected:
  "Port is not connected. This can happen during surprise removal.
   Do not warn."
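
The translation described in steps 1 and 2 can be illustrated with a small
userspace model. This is only a sketch: the enum, its members, and
model_cfg_get_error() are hypothetical stand-ins, not the definitions in
drivers/thunderbolt/ctl.c, and the -EIO fallback is an assumption made for
the example.

```c
#include <errno.h>

/* Hypothetical subset of control-channel error codes (illustration only). */
enum model_cfg_error {
	MODEL_CFG_ERROR_PORT_NOT_CONNECTED,
	MODEL_CFG_ERROR_OTHER,
};

/*
 * Map a control-channel error to a negative errno value, mirroring the
 * "port not connected" -> -ENOTCONN translation described above.
 */
static int model_cfg_get_error(enum model_cfg_error err)
{
	if (err == MODEL_CFG_ERROR_PORT_NOT_CONNECTED)
		return -ENOTCONN;

	return -EIO;	/* assumed generic fallback for the sketch */
}
```

A caller that sees -ENOTCONN from this path cannot tell a surprise removal
from a port that is only briefly unreachable, which is why the failure
surfaces as a warning in tb_path_activate().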

Attempted Solutions:
====================
1. Retry logic on -ENOTCONN in tb_path_activate():
   Result: Caused host port (0:0) lockup with hundreds of "downstream
   port is locked" errors, so this approach was dropped.

2. Increased tb_port_wait_for_link_width() timeout from 100ms to 3000ms:
   Result: Did not resolve the issue. The timeout increase alone is
   insufficient because the port state hasn't reached TB_PORT_UP when
   lane bonding is attempted.

3. Added msleep(2000) at various points in enumeration flow:
   Locations tested:
   - Before tb_switch_configure(): Works ✓
   - Before tb_switch_add(): Works ✓
   - Before usb4_port_hotplug_enable(): Works ✓
   - After tb_switch_add(): Doesn't work ✗
   - In tb_configure_link(): Doesn't work ✗
   - In tb_switch_lane_bonding_enable(): Doesn't work ✗
   - In tb_port_wait_for_link_width(): Doesn't work ✗

   The pattern shows the delay must occur BEFORE hotplug enable, which
   happens early in tb_switch_port_hotplug_enable() -> usb4_port_hotplug_enable().

Current Workaround:
===================
Add a 2-second delay in tb_wait_for_port() when the port state reaches
TB_PORT_UP. This is the earliest point where we know:
- The link is physically established
- The device is responsive
- But lane width negotiation may still be in progress

This location is chosen because:
1. It's called during port enumeration before any tunnel creation
2. The port has just transitioned to TB_PORT_UP state
3. It allows sufficient time for lane bonding to complete
4. It avoids affecting other code paths

Testing Results:
================
With this patch:
- No call trace on first plug
- Device enumerates correctly on first attempt
- Full bandwidth (36000 Mb/s) available immediately
- No disconnect/reconnect cycle
- USB3 and PCIe tunnels are created successfully

Without this patch:
- Call trace on every first plug
- Only 9000 Mb/s bandwidth (single lane) on first attempt
- Automatic disconnect/reconnect after ~3 seconds
- Second attempt works with 36000 Mb/s

Discussion Points for RFC:
===========================
1. Is a fixed 2-second delay acceptable, or should we poll for a
   specific hardware state?

2. Should we check PORT_CS_18_TIP (Transition In Progress) bit instead
   of using a fixed delay?

3. Is there a better location for this delay in the enumeration flow?

4. Should this be device-specific (based on vendor/device ID) or apply
   to all USB4 devices?

5. The 100ms timeout in tb_switch_lane_bonding_enable() may be too
   short for other devices as well. Should we increase it universally?
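
For discussion points 1 and 2, the polling alternative to a fixed delay
could look roughly like the following userspace model. It is a sketch under
stated assumptions: struct model_port, the simulated clock, and all model_*
helpers are hypothetical and do not correspond to existing kernel functions
or to the actual register layout.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical port model: the transition completes at ready_at_ms. */
struct model_port {
	unsigned int now_ms;		/* simulated clock */
	unsigned int ready_at_ms;	/* time at which the transition ends */
};

/* Hypothetical stand-in for reading a "transition in progress" status. */
static bool model_transition_done(const struct model_port *port)
{
	return port->now_ms >= port->ready_at_ms;
}

/* Hypothetical stand-in for a sleep; it only advances the simulated clock. */
static void model_sleep_ms(struct model_port *port, unsigned int ms)
{
	port->now_ms += ms;
}

/*
 * Poll until the transition completes or timeout_ms elapses.
 * interval_ms must be non-zero. Returns 0 on success, -ETIMEDOUT otherwise.
 */
static int model_wait_for_transition(struct model_port *port,
				     unsigned int timeout_ms,
				     unsigned int interval_ms)
{
	unsigned int deadline = port->now_ms + timeout_ms;

	for (;;) {
		if (model_transition_done(port))
			return 0;
		if (port->now_ms >= deadline)
			return -ETIMEDOUT;
		model_sleep_ms(port, interval_ms);
	}
}
```

Unlike an unconditional msleep(2000), a loop of this shape returns as soon
as the condition is met, so devices that settle quickly do not pay the full
delay, and the timeout only bounds the worst case.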

Hardware Details:
=================
Device: Dell U2725QE Thunderbolt Monitor
USB4 Router: 8087:b26 (Intel USB4 controller)
Host: AMD Thunderbolt 4 controller (0000:c7:00.6)

Signed-off-by: Chia-Lin Kao (AceLan) <acelan.kao@...onical.com>
---
Full dmesg log available at: https://paste.ubuntu.com/p/CXs2T4XzZ3/
---
 drivers/thunderbolt/switch.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/thunderbolt/switch.c b/drivers/thunderbolt/switch.c
index b3948aad0b955..e0c65e5fb0dca 100644
--- a/drivers/thunderbolt/switch.c
+++ b/drivers/thunderbolt/switch.c
@@ -530,6 +530,8 @@ int tb_wait_for_port(struct tb_port *port, bool wait_if_unplugged)
 			return 0;
 
 		case TB_PORT_UP:
+			msleep(2000);
+			fallthrough;
 		case TB_PORT_TX_CL0S:
 		case TB_PORT_RX_CL0S:
 		case TB_PORT_CL1:
-- 
2.43.0

