[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <ZfApxyVAJMK4bL8O@hovoldconsulting.com>
Date: Tue, 12 Mar 2024 11:09:11 +0100
From: Johan Hovold <johan@...nel.org>
To: Abhinav Kumar <quic_abhinavk@...cinc.com>
Cc: freedreno@...ts.freedesktop.org, Rob Clark <robdclark@...il.com>,
Dmitry Baryshkov <dmitry.baryshkov@...aro.org>,
Sean Paul <sean@...rly.run>,
Marijn Suijten <marijn.suijten@...ainline.org>,
David Airlie <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>,
Kuogee Hsieh <quic_khsieh@...cinc.com>,
dri-devel@...ts.freedesktop.org, swboyd@...omium.org,
quic_jesszhan@...cinc.com, quic_parellan@...cinc.com,
quic_bjorande@...cinc.com, Rob Clark <robdclark@...omium.org>,
linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/msm/dp: move link_ready out of HPD event thread
On Fri, Mar 08, 2024 at 01:45:32PM -0800, Abhinav Kumar wrote:
> There are cases where the userspace might still send another
> frame after the HPD disconnect causing a modeset cycle after
> a disconnect. This messes the internal state machine of MSM DP driver
> and can lead to a crash as there can be an imbalance between
> bridge_disable() and bridge_enable().
Can you be more specific here? What steps would lead to this issue and
how exactly does is mess with the state machine? Is there an easy way
to reproduce it (e.g. by instrumenting the code with some sleep)?
The hotplug code is really convoluted and having a clear description of
the problem is needed to evaluate the patch (including when revisiting
it some time from now when I've forgotten about how this state machine
works).
As you know, we ran into a related issue on sc8280xp (X13s) since
6.8-rc1, but that did not involve any user space interaction at all.
For reference, there are some more details in this thread:
https://lore.kernel.org/all/Ze8Ke_M2xHyPYCu-@hovoldconsulting.com/
> This was also previously reported on [1] for which [2] was posted
> and helped resolve the issue by rejecting commits if the DP is not
> in connected state.
>
> The change resolved the bug but there can also be another race condition.
> If hpd_event_thread does not pick up the EV_USER_NOTIFICATION and process it
> link_ready will also not be set to false allowing the frame to sneak in.
How could the event thread fail to pick up the notification event? Or do
you mean there's a race window before it has been processed?
> Lets move setting link_ready outside of hpd_event_thread() processing to
> eliminate a window of race condition.
As we discussed in thread above, this patch does not eliminate the race,
even if it may reduce the race window.
> [1] : https://gitlab.freedesktop.org/drm/msm/-/issues/17
> [2] : https://lore.kernel.org/all/1664408211-25314-1-git-send-email-quic_khsieh@quicinc.com/
>
> Fixes: 8a3b4c17f863 ("drm/msm/dp: employ bridge mechanism for display enable and disable")
> Signed-off-by: Abhinav Kumar <quic_abhinavk@...cinc.com>
> @@ -466,6 +466,8 @@ static int dp_display_notify_disconnect(struct device *dev)
> {
> struct dp_display_private *dp = dev_get_dp_display_private(dev);
>
> + dp->dp_display.link_ready = false;
As I also pointed out in the other thread, setting link_ready to false
here means that any spurious connect event (during physical disconnect)
will always be processed, something which can currently lead to a leaked
runtime pm reference.
Wasting some power is of course preferred over crashing the machine, but
please take it into consideration anyway.
Especially if your intention with this patch was to address the resets
we saw with sc8280xp which are gone since the HPD notify revert (which
fixed the hotplug detect issue that left the bridge in a
half-initialised state).
> +
> dp_add_event(dp, EV_USER_NOTIFICATION, false, 0);
>
> return 0;
Johan
Powered by blists - more mailing lists