Message-ID: <20241011183751.7d27c59c@kf-ir16>
Date: Fri, 11 Oct 2024 18:37:51 -0500
From: Aaron Rainbolt <arainbolt@...cus.org>
To: Mika Westerberg <mika.westerberg@...ux.intel.com>
Cc: YehezkelShB@...il.com, michael.jamet@...el.com,
andreas.noever@...il.com, linux-usb@...r.kernel.org, mmikowski@...cus.org,
linux-kernel@...r.kernel.org, Gil Fine <gil.fine@...ux.intel.com>
Subject: Re: USB-C DisplayPort display failing to stay active with Intel
Barlow Ridge USB4 controller, power-management related issue?
On Fri, 11 Oct 2024 19:38:11 +0300
Mika Westerberg <mika.westerberg@...ux.intel.com> wrote:
> Hi,
>
> On Thu, Oct 10, 2024 at 11:26:56PM -0500, Aaron Rainbolt wrote:
> > > Can you share full dmesg with the repro and
> > > "thunderbolt.dyndbg=+p" in the kernel command line?
> >
> > The full log is very long, so I've included it as an email
> > attachment. The exact steps taken after booting with the requested
> > kernel parameter were:
> >
> > 1. boot with thunderbolt.dyndbg=+p kernel param, no USB-C plugged
> > in.
> > 2. After login, hot-plug two USB-C cables. This time, the
> > displays came up and stayed resident (this happens sometimes)
> > 3. Unplugged both cables.
> > 4. Replugged both. This time, the displays did not show anything.
> > 5. lspci -k "jiggled" the displays and they came back on.
> > 6. After ~15s, the displays blacked out again.
> > 7. Save to the dmesg file after about 30s.
> >
> > The laptop's firmware is fully up-to-date. One of the fixes we tried
> > was installing Windows 11, updating the firmware, and then
> > re-installing Kubuntu 24.04. This had no effect on the issue.
> >
> > Notes:
> >
> > * Kernel 6.1 does not exhibit this time out. 6.5 and later do.
> > * Windows 11 had very similar behavior before installing Windows
> > updates. After update, it was fixed.
> > * All distros and W11 were tested on the same hardware with the
> > latest firmware, so we know this is not a hardware failure.
>
> Thanks for the logs and steps!
>
> I now realize that
>
> a75e0684efe5 ("thunderbolt: Keep the domain powered when USB4 port
> is in redrive mode")
>
> was half-baked. Yes it deals with the situation where plugging in
> monitor when the domain is powered. However, it completely misses
> these cases:
>
> * Plug in monitor to the Type-C port when the controller is runtime
> suspended.
> * Boot with monitor plugged in to the Type-C port.
>
> At the end of this email there is a hack patch that tries to solve
> this. Can you try it out? I will be on vacation next week but I'm
> copying my colleague Gil who is familiar with this too. He should be
> able to help you out during my absence.
>
> Couple of notes about the dmesg you shared. They don't affect this
> issue but may cause other issues:
>
> > [ 1.382718] thunderbolt 0000:06:00.0: device links to tunneled
> > native ports are missing!
>
> This means the BIOS does not implement the USB4 power contract
> which means that USB 3.x and PCIe tunnels will not work as expected
> after power transition.
>
> > [ 1.416488] thunderbolt 0000:06:00.0: 0: NVM version 14.86
>
> This is a really old firmware version. My development system for example
> has 56.x so yours might have a bunch of issues that are solved in the
> later versions.
>
> The hack patch below:
>
> diff --git a/drivers/thunderbolt/tb.c b/drivers/thunderbolt/tb.c
> index 07a66594e904..0e424b7661be 100644
> --- a/drivers/thunderbolt/tb.c
> +++ b/drivers/thunderbolt/tb.c
> @@ -2113,6 +2113,37 @@ static void tb_exit_redrive(struct tb_port *port)
> }
> }
>
> +static void tb_switch_enter_redrive(struct tb_switch *sw)
> +{
> + struct tb_port *port;
> +
> + tb_switch_for_each_port(sw, port)
> + tb_enter_redrive(port);
> +}
> +
> +/*
> + * Called during system and runtime suspend to forcefully exit redrive
> + * mode without querying whether the resource is available.
> + */
> +static void tb_switch_exit_redrive(struct tb_switch *sw)
> +{
> + struct tb_port *port;
> +
> + if (!(sw->quirks & QUIRK_KEEP_POWER_IN_DP_REDRIVE))
> + return;
> +
> + tb_switch_for_each_port(sw, port) {
> + if (!tb_port_is_dpin(port))
> + continue;
> +
> + if (port->redrive) {
> + port->redrive = false;
> + pm_runtime_put(&sw->dev);
> + tb_port_dbg(port, "exit redrive mode\n");
> + }
> + }
> +}
> +
> static void tb_dp_resource_unavailable(struct tb *tb, struct tb_port
> *port, const char *reason)
> {
> @@ -2987,6 +3018,7 @@ static int tb_start(struct tb *tb, bool reset)
> tb_create_usb3_tunnels(tb->root_switch);
> /* Add DP IN resources for the root switch */
> tb_add_dp_resources(tb->root_switch);
> + tb_switch_enter_redrive(tb->root_switch);
> /* Make the discovered switches available to the userspace */
> device_for_each_child(&tb->root_switch->dev, NULL,
> tb_scan_finalize_switch);
> @@ -3002,6 +3034,7 @@ static int tb_suspend_noirq(struct tb *tb)
>
> tb_dbg(tb, "suspending...\n");
> tb_disconnect_and_release_dp(tb);
> + tb_switch_exit_redrive(tb->root_switch);
> tb_switch_suspend(tb->root_switch, false);
> tcm->hotplug_active = false; /* signal tb_handle_hotplug to quit */
> tb_dbg(tb, "suspend finished\n");
> @@ -3094,6 +3127,7 @@ static int tb_resume_noirq(struct tb *tb)
> tb_dbg(tb, "tunnels restarted, sleeping for 100ms\n");
> msleep(100);
> }
> + tb_switch_enter_redrive(tb->root_switch);
> /* Allow tb_handle_hotplug to progress events */
> tcm->hotplug_active = true;
> tb_dbg(tb, "resume finished\n");
> @@ -3157,6 +3191,8 @@ static int tb_runtime_suspend(struct tb *tb)
> struct tb_cm *tcm = tb_priv(tb);
>
> mutex_lock(&tb->lock);
> + tb_disconnect_and_release_dp(tb);
> + tb_switch_exit_redrive(tb->root_switch);
> tb_switch_suspend(tb->root_switch, true);
> tcm->hotplug_active = false;
> mutex_unlock(&tb->lock);
> @@ -3188,6 +3224,7 @@ static int tb_runtime_resume(struct tb *tb)
> tb_restore_children(tb->root_switch);
> list_for_each_entry_safe(tunnel, n, &tcm->tunnel_list, list)
> tb_tunnel_activate(tunnel);
> + tb_switch_enter_redrive(tb->root_switch);
> tcm->hotplug_active = true;
> mutex_unlock(&tb->lock);
>
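For anyone following the thread, here is a rough user-space model of the reference counting the quoted patch relies on: one runtime PM reference is taken per DP IN port that enters redrive mode, and it is dropped again on the forced exit before suspend. This is only a sketch. The `mock_*` names, the `monitor_attached` flag, and the guard against entering twice are assumptions made for illustration and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver's structures; NOT the kernel types. */
struct mock_switch {
	int pm_usage;		/* models the runtime PM usage count */
};

struct mock_port {
	struct mock_switch *sw;
	bool is_dpin;		/* models tb_port_is_dpin() */
	bool monitor_attached;	/* assumption: a monitor is redriven on this port */
	bool redrive;		/* models port->redrive */
};

/* Enter redrive: take one PM reference for a DP IN port with a monitor. */
static void mock_enter_redrive(struct mock_port *port)
{
	/* The "already in redrive" guard is an assumption of this model. */
	if (!port->is_dpin || !port->monitor_attached || port->redrive)
		return;
	port->redrive = true;
	port->sw->pm_usage++;	/* stands in for pm_runtime_get() */
}

/* Forced exit, as on suspend: drop the reference without asking the hardware. */
static void mock_force_exit_redrive(struct mock_port *port)
{
	if (!port->is_dpin || !port->redrive)
		return;
	port->redrive = false;
	port->sw->pm_usage--;	/* stands in for pm_runtime_put() */
	assert(port->sw->pm_usage >= 0);
}

/* Apply enter (start or resume) to every port of the mock switch. */
static void mock_switch_enter_redrive(struct mock_port *ports, size_t n)
{
	for (size_t i = 0; i < n; i++)
		mock_enter_redrive(&ports[i]);
}

/* Apply the forced exit (suspend) to every port of the mock switch. */
static void mock_switch_exit_redrive(struct mock_port *ports, size_t n)
{
	for (size_t i = 0; i < n; i++)
		mock_force_exit_redrive(&ports[i]);
}
```

In this model, calling mock_switch_enter_redrive() and then mock_switch_exit_redrive() on the same ports leaves pm_usage where it started, which is the balance the suspend and resume hooks in the patch appear to aim for.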
Attached are the test results (including dmesg log) after testing with
our version of the 6.8 kernel with this patch applied. Sadly we didn't
have time to test with 6.11.2 as the machines we were testing on had to
be shipped to customers and we found a working stop-gap solution in the
meantime. The test that we ran, along with its results, is as
follows:
1. Start with Laptop powered-off
2. Unplug all USB-C connectors.
3. Attempt to update firmware using Windows.
Version remains 'thunderbolt 0000:06:00.0: 0: NVM version 14.86'
4. Boot Kubuntu 24.04 with patched kernel, adding the cmdline parameter
thunderbolt.dyndbg=+p.
Note that a "thunderbolt.kf_force_redrive=1" kernel parameter was also
included by mistake, but it is ignored in this kernel. (It was a
leftover from a testing kernel we made.)
5. Log in normally via SDDM to KDE 5.27.11.
6. Open 'Display Settings KCM' to view display
detection.
7. Plug in 2 x USB-C connectors attached to 4k displays.
- Note these work with Kernel 6.1 and non-Barlow Ridge systems (TB
4).
- Displays wake up, but never show a graphics signal. They time out
and return to powersave mode.
- Displays never appear in 'Display Settings KCM.'
- This is NOT desired behavior; displays should show.
8. Open a terminal and run 'lspci -k'
- Both displays are activated and remain active.
- There is no timeout.
- This is desired behavior.
9. Unplug USB-C connectors one at a time.
- System recognizes unplug, retains other display
- This is desired behavior.
10. Replug USB-C #1 attached to 4k display
- System does not show reattached display
- This is NOT desired behavior; display should show.
11. Replug USB-C #2
- System does not show reattached display
- This is NOT desired behavior; display should show.
12. Sleep and resume laptop (this powers-down and repowers USB-C)
- Displays are restored in 'Display Settings KCM'
- Displays show graphical output
- Displays remain connected
- This is desired behavior.
13. Replug cables again
- Displays now are removed and shown as expected.
- This is desired behavior.
14. Write dmesg log attached here.
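As a possible aid for interpreting step 8: one plausible explanation, not confirmed by these tests, is that 'lspci -k' touches the controller's PCI configuration space and so wakes it from runtime suspend. A minimal sketch for recording the controller's runtime PM state before and after running lspci could look like the following. The sysfs attribute power/runtime_status is standard; the helper names are hypothetical, and the PCI address 0000:06:00.0 is assumed from the dmesg lines quoted earlier in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Read a one-line sysfs attribute into buf, stripping the trailing newline.
 * Returns 0 on success, -1 if the file cannot be opened or is empty.
 */
static int read_sysfs_line(const char *path, char *buf, size_t len)
{
	FILE *f;

	assert(buf != NULL && len > 0);
	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, (int)len, f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	buf[strcspn(buf, "\n")] = '\0';
	return 0;
}

/*
 * Print the runtime PM status of one PCI device, e.g. "0000:06:00.0"
 * (address assumed from the dmesg output in this thread).
 * Returns 0 on success, -1 if the attribute could not be read.
 */
static int print_runtime_status(const char *pci_addr)
{
	char path[128];
	char status[32];

	snprintf(path, sizeof(path),
		 "/sys/bus/pci/devices/%s/power/runtime_status", pci_addr);
	if (read_sysfs_line(path, status, sizeof(status)) != 0)
		return -1;
	printf("%s runtime_status: %s\n", pci_addr, status);
	return 0;
}
```

Calling print_runtime_status("0000:06:00.0") once after replugging the cables and once after running lspci would show whether the controller moves from "suspended" to "active" between the two, assuming that address still matches the controller on the machine under test.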
Thanks again for your assistance. We should have a dedicated testing
machine in a few days, which will let us repeat our tests on the latest
stable upstream kernel.
View attachment "2024-10-11_dmesg-thunderbolt.dynaorg+p_fw14.86.log" of type "text/x-log" (139633 bytes)