Message-ID: <ZfFhXG5yd6O29spS@hovoldconsulting.com>
Date: Wed, 13 Mar 2024 09:18:36 +0100
From: Johan Hovold <johan@...nel.org>
To: Abhinav Kumar <quic_abhinavk@...cinc.com>
Cc: freedreno@...ts.freedesktop.org, Rob Clark <robdclark@...il.com>,
	Dmitry Baryshkov <dmitry.baryshkov@...aro.org>,
	Sean Paul <sean@...rly.run>,
	Marijn Suijten <marijn.suijten@...ainline.org>,
	David Airlie <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>,
	Kuogee Hsieh <quic_khsieh@...cinc.com>,
	dri-devel@...ts.freedesktop.org, swboyd@...omium.org,
	quic_jesszhan@...cinc.com, quic_parellan@...cinc.com,
	quic_bjorande@...cinc.com, Rob Clark <robdclark@...omium.org>,
	linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] drm/msm/dp: move link_ready out of HPD event thread

On Tue, Mar 12, 2024 at 10:39:46AM -0700, Abhinav Kumar wrote:
> On 3/12/2024 9:59 AM, Johan Hovold wrote:

> >> Heh. This is getting ridiculous. I just tried running with this patch
> >> and it again breaks hotplug detect in a VT console and in X (where I
> >> could enable a reconnected external display by running xrandr twice
> >> before).
> >>
> >> So, please, do not apply this one.
> > 
> > To make things worse, I indeed also hit the reset when disconnecting
> > after such a failed hotplug.

> Ack, I will hold off till I analyze your issues more which you have 
> listed in separate replies. Especially about the spurious connect, I 
> believe you are trying to mention that, by adding logs, you are able to 
> delay the processing of a connect event to *make* it like a spurious 
> one? In case, I got this part wrong, can you pls explain the spurious 
> connect scenario again?

No, I only mentioned the debug printks in passing, as instrumentation
like that may affect race conditions (but I'm hitting the resets with
no printks in place as well).

The spurious connect event comes directly from the pmic firmware, and
even if we may optimise things by implementing some kind of debounce,
the hotplug implementation needs to be robust enough to not kill the
machine if such an event gets through.
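
To illustrate what "some kind of debounce" could look like, here is a
minimal, self-contained userspace sketch. It is not the msm driver's code:
the struct, the function names and the settle window are all hypothetical,
and only the status values (1 = connected, 2 = disconnected) are taken from
the log further down. A raw status is reported upward only once it has
stayed unchanged for settle_ms.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; values mirror "status = 1/2" in the log below. */
enum hpd_status { HPD_CONNECTED = 1, HPD_DISCONNECTED = 2 };

struct hpd_debounce {
	enum hpd_status pending;	/* most recent raw status seen */
	uint64_t pending_since_ms;	/* when 'pending' last changed */
	enum hpd_status stable;		/* last status reported upward */
	uint64_t settle_ms;		/* assumed settle window */
};

static void hpd_debounce_init(struct hpd_debounce *d,
			      enum hpd_status initial, uint64_t settle_ms)
{
	d->pending = initial;
	d->pending_since_ms = 0;
	d->stable = initial;
	d->settle_ms = settle_ms;
}

/* Record a raw notification; restarts the window if the status changed. */
static void hpd_debounce_event(struct hpd_debounce *d,
			       enum hpd_status status, uint64_t now_ms)
{
	if (status != d->pending) {
		d->pending = status;
		d->pending_since_ms = now_ms;
	}
}

/*
 * Returns true, and stores the new status in *out, only when the pending
 * status differs from the reported one and has held for settle_ms.
 */
static bool hpd_debounce_poll(struct hpd_debounce *d, uint64_t now_ms,
			      enum hpd_status *out)
{
	if (d->pending == d->stable)
		return false;
	if (now_ms - d->pending_since_ms < d->settle_ms)
		return false;
	d->stable = d->pending;
	*out = d->stable;
	return true;
}
```

With a window of, say, 50 ms, a disconnect/connect/disconnect burst spread
over some 25 ms would collapse into a single disconnect. A filter like this
only narrows the race, though, which is why the handling behind it still
has to cope with an event that gets through.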

Basically what I see is that during physical disconnect there can be
multiple hpd notify events (e.g. connect, disconnect, connect):

[  146.910195] usb 5-1: USB disconnect, device number 4
[  146.931026] msm-dp-display ae98000.displayport-controller: dp_bridge_hpd_notify - link_ready = 1, status = 2
[  146.934785] msm-dp-display ae98000.displayport-controller: dp_hpd_unplug_handle
[  146.938114] msm-dp-display ae98000.displayport-controller: dp_bridge_hpd_notify - link_ready = 1, status = 1
[  146.940245] [CONNECTOR:35:DP-2] status updated from disconnected to connected
[  146.955193] msm-dp-display ae98000.displayport-controller: dp_bridge_hpd_notify - link_ready = 0, status = 2

And it is the spurious connect event while the link is being torn down
that triggers the hotplug processing that leads to the reset.
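
As a rough illustration of the robustness being asked for, the sketch
below models hotplug handling as a small state machine that never starts
bring-up while a teardown is still in flight. It is a standalone model
under assumed names (the states, events and actions are all hypothetical)
and does not reflect how the msm DP driver is actually structured. A
connect seen during teardown is only remembered, and acted on once the
teardown has completed, unless a later disconnect cancels it.

```c
#include <stdbool.h>

/* All names below are hypothetical and exist only for this sketch. */
enum link_state { LINK_DOWN, LINK_UP, LINK_TEARING_DOWN };
enum hpd_event { EV_CONNECT, EV_DISCONNECT, EV_TEARDOWN_DONE };
enum hpd_action { ACT_NONE, ACT_BRING_UP, ACT_START_TEARDOWN };

struct hpd_sm {
	enum link_state state;
	bool replug_pending;	/* connect seen while tearing down */
};

/* Returns the action the caller should take for this event. */
static enum hpd_action hpd_sm_handle(struct hpd_sm *sm, enum hpd_event ev)
{
	switch (sm->state) {
	case LINK_DOWN:
		if (ev == EV_CONNECT) {
			sm->state = LINK_UP;
			return ACT_BRING_UP;
		}
		return ACT_NONE;
	case LINK_UP:
		if (ev == EV_DISCONNECT) {
			sm->state = LINK_TEARING_DOWN;
			sm->replug_pending = false;
			return ACT_START_TEARDOWN;
		}
		return ACT_NONE;
	case LINK_TEARING_DOWN:
		/* Never start bring-up here; just track the latest status. */
		if (ev == EV_CONNECT) {
			sm->replug_pending = true;
		} else if (ev == EV_DISCONNECT) {
			sm->replug_pending = false;
		} else if (ev == EV_TEARDOWN_DONE) {
			if (sm->replug_pending) {
				sm->replug_pending = false;
				sm->state = LINK_UP;
				return ACT_BRING_UP;
			}
			sm->state = LINK_DOWN;
		}
		return ACT_NONE;
	}
	return ACT_NONE;
}
```

Fed the disconnect, connect, disconnect sequence from the log above,
followed by teardown completion, this model starts exactly one teardown
and ends up in LINK_DOWN without ever attempting a bring-up.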

Similarly, I've seen spurious disconnect events while the plug is being
inserted.

> A short response on why this change was made is that commit can be 
> issued by userspace or the fbdev client. So userspace involvement only 
> makes commit happen from a different path. It would be incorrect to 
> assume the issues from the earlier bug and the current one are different 
> only because there was userspace involvement in that one and not this.
>
> Because in the end, it manifests itself in the same way that 
> atomic_enable() did not go through after an atomic_disable() and the 
> next atomic_disable() crashes.

Right, but your proposed fix would not actually fix anything, and judging
from the sparse commit message and the diff itself, it is clearly only
meant to mitigate the case where user space is involved, which is *not*
the case here.

Johan
