linux-kernel - Re: USB-C DisplayPort display failing to stay active with Intel Barlow Ridge USB4 controller, power-management related issue?

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <20241031095542.587e8aa6@kf-ir16>
Date: Thu, 31 Oct 2024 09:55:42 -0500
From: Aaron Rainbolt <arainbolt@...cus.org>
To: Mika Westerberg <mika.westerberg@...ux.intel.com>
Cc: YehezkelShB@...il.com, michael.jamet@...el.com,
 andreas.noever@...il.com, linux-usb@...r.kernel.org, mmikowski@...cus.org,
 linux-kernel@...r.kernel.org, Gil Fine <gil.fine@...ux.intel.com>
Subject: Re: USB-C DisplayPort display failing to stay active with Intel
 Barlow Ridge USB4 controller, power-management related issue?

On Thu, 24 Oct 2024 18:43:41 +0300
Mika Westerberg <mika.westerberg@...ux.intel.com> wrote:

> Hi,
> 
> On Wed, Oct 23, 2024 at 05:44:13PM -0500, Aaron Rainbolt wrote:
> > Hey, thanks for taking a look! Our tester re-did the tests with
> > kernel 6.12~rc4, and reported the following results doing so, along
> > with another dmesg log. I think your question about xrandr is
> > answered in this report. The dmesg log is attached.
> > 
> > With the vanilla rc4 kernel plus your patch from earlier:
> > 
> > -----
> > 
> > 1. Start with Laptop powered-off
> > 2. Unplug all USB-C connectors.
> > 3. Boot Kubuntu 24.04 with patched kernel 6.12.0-rc4, add cmdline
> >    parameter thunderbolt.dyndbg=+p. All other optional parameters
> > were removed.
> > 4. Log in to normal SDDM to KDE 5.27.11.
> > 5. Open 'Display Settings KCM' to view display detection.
> > 6. Plug in one UBC-C connector attached to 4k display.
> >    * Note these work with Kernel 6.1 and non-Barlow Ridge systems
> > (TBT 4).
> >    * Display does not wake up.
> >    * Display never appears in 'Display Settings KCM.'
> >    * This is NOT desired behavior; display should show.
> >    * (Note: The test results I was given do not mention xrandr here,
> >      however as subsequent results mention it I believe that the
> >      monitor does *not* appear in xrandr here. I will double-check
> >      to be sure.)
> > Notes:
> > 
> > 1. With debug off, the recognition of screens is better, and
> > previously "just worked", at least for one screen.
> > 2. W11 updated works, as do kernels Kernels 6.1 and earlier.
> > 3. W11 from Q4 2022 (pre-update) and kernels 6.5+ do not. With both,
> >    the screens usually initially attach and then time out after
> > 15s.  
> 
> Yea, they work because we added Barlow Ridge support later and this
> problem is specific only on it. None of the integrated or Maple Ridge
> suffers from this "feature".
> 
> Below is an updated patch. This one checks if the DP resource is
> available before it adds it. I hope this covers the case where we get
> the hotplugs even when you have Type-C cable plugged (it should not
> happen, and does not happen on my test system but I have newer firmwre
> so could be firmware related). I wonder if you can try that one too
> with the same flow as above (up to step 6).
> 
> The checkerboard is unrelated issue so I would deal that separarely
> with the nouveau folks. The Thunderbolt/USB4 driver has no visibility
> what is going on with the redriven DP signaling except that it tries
> to keep the thing powered as long as it is needed.
> 
> diff --git a/drivers/thunderbolt/tb.c b/drivers/thunderbolt/tb.c
> index 07a66594e904..ee5c8ba75baa 100644
> --- a/drivers/thunderbolt/tb.c
> +++ b/drivers/thunderbolt/tb.c
> @@ -2113,6 +2113,37 @@ static void tb_exit_redrive(struct tb_port
> *port) }
>  }
>  
> +static void tb_switch_enter_redrive(struct tb_switch *sw)
> +{
> +	struct tb_port *port;
> +
> +	tb_switch_for_each_port(sw, port)
> +		tb_enter_redrive(port);
> +}
> +
> +/*
> + * Called during system and runtime suspend to forcefully exit
> redrive
> + * mode without querying whether the resource is available.
> + */
> +static void tb_switch_exit_redrive(struct tb_switch *sw)
> +{
> +	struct tb_port *port;
> +
> +	if (!(sw->quirks & QUIRK_KEEP_POWER_IN_DP_REDRIVE))
> +		return;
> +
> +	tb_switch_for_each_port(sw, port) {
> +		if (!tb_port_is_dpin(port))
> +			continue;
> +
> +		if (port->redrive) {
> +			port->redrive = false;
> +			pm_runtime_put(&sw->dev);
> +			tb_port_dbg(port, "exit redrive mode\n");
> +		}
> +	}
> +}
> +
>  static void tb_dp_resource_unavailable(struct tb *tb, struct tb_port
> *port, const char *reason)
>  {
> @@ -2157,8 +2188,17 @@ static void tb_dp_resource_available(struct tb
> *tb, struct tb_port *port) return;
>  	}
>  
> -	tb_port_dbg(port, "DP %s resource available after hotplug\n",
> -		    tb_port_is_dpin(port) ? "IN" : "OUT");
> +	if (tb_port_is_dpin(port)) {
> +		/* Verify that the resource is really available */
> +		if (!tb_switch_query_dp_resource(port->sw, port)) {
> +			tb_port_info(port, "got hotplug but DP IN
> resource not available\n");
> +			return;
> +		}
> +		tb_port_dbg(port, "DP IN resource available after
> hotplug\n");
> +	} else {
> +		tb_port_dbg(port, "DP OUT resource available after
> hotplug\n");
> +	}
> +
>  	list_add_tail(&port->list, &tcm->dp_resources);
>  	tb_exit_redrive(port);
>  
> @@ -2987,6 +3027,7 @@ static int tb_start(struct tb *tb, bool reset)
>  	tb_create_usb3_tunnels(tb->root_switch);
>  	/* Add DP IN resources for the root switch */
>  	tb_add_dp_resources(tb->root_switch);
> +	tb_switch_enter_redrive(tb->root_switch);
>  	/* Make the discovered switches available to the userspace */
>  	device_for_each_child(&tb->root_switch->dev, NULL,
>  			      tb_scan_finalize_switch);
> @@ -3002,6 +3043,7 @@ static int tb_suspend_noirq(struct tb *tb)
>  
>  	tb_dbg(tb, "suspending...\n");
>  	tb_disconnect_and_release_dp(tb);
> +	tb_switch_exit_redrive(tb->root_switch);
>  	tb_switch_suspend(tb->root_switch, false);
>  	tcm->hotplug_active = false; /* signal tb_handle_hotplug to
> quit */ tb_dbg(tb, "suspend finished\n");
> @@ -3094,6 +3136,7 @@ static int tb_resume_noirq(struct tb *tb)
>  		tb_dbg(tb, "tunnels restarted, sleeping for
> 100ms\n"); msleep(100);
>  	}
> +	tb_switch_enter_redrive(tb->root_switch);
>  	 /* Allow tb_handle_hotplug to progress events */
>  	tcm->hotplug_active = true;
>  	tb_dbg(tb, "resume finished\n");
> @@ -3157,6 +3200,8 @@ static int tb_runtime_suspend(struct tb *tb)
>  	struct tb_cm *tcm = tb_priv(tb);
>  
>  	mutex_lock(&tb->lock);
> +	tb_disconnect_and_release_dp(tb);
> +	tb_switch_exit_redrive(tb->root_switch);
>  	tb_switch_suspend(tb->root_switch, true);
>  	tcm->hotplug_active = false;
>  	mutex_unlock(&tb->lock);
> @@ -3188,6 +3233,7 @@ static int tb_runtime_resume(struct tb *tb)
>  	tb_restore_children(tb->root_switch);
>  	list_for_each_entry_safe(tunnel, n, &tcm->tunnel_list, list)
>  		tb_tunnel_activate(tunnel);
> +	tb_switch_enter_redrive(tb->root_switch);
>  	tcm->hotplug_active = true;
>  	mutex_unlock(&tb->lock);
>  

Sorry for the delays, been working on testing the patch and had other
things taking priority.

I keep having to manually apply the patches you're sending "by hand" to
the latest mainline kernel because it doesn't apply cleanly. In
particular, I notice that in the very last "segment" of the patch (the
one that applies to tb_runtime_resume), the "list_for_each_entry_safe"
line shows that it's running "tb_tunnel_activate" in your tree, whereas
the upstream 6.12~rc5 code is running "tb_tunnel_restart" there. Would
it be too much of a hassle to ask for a patch that applies cleanly to
the 6.11.5 kernel? That would be very handy since that's the easiest
supported upstream kernel for us to test against, and would make sure
that we're not seeing any weird bugs as a result of kernel sources
mismatch.

I've attached the test results and dmesg logs from using the latest
patch with the 6.12~rc5 kernel. In summary, monitors aren't showing up
in xrandr when initially plugged in, but once "lspci -k" is run, they
are detected and remain powered on.

Thanks again for your help!

View attachment "2024-10-29_6.12.0-rc5_tbt-test-plan.md" of type "text/markdown" (9453 bytes)

View attachment "2024-10-29_6.12.0-rc5_tbt-barlow-ridge-01-dmesg.txt" of type "text/plain" (128390 bytes)

View attachment "2024-10-29_6.12.0-rc5_tbt-barlow-ridge-02-dmesg.txt" of type "text/plain" (111604 bytes)