Date: Tue, 27 Feb 2024 14:33:48 +0100
From: Johan Hovold <johan@...nel.org>
To: Abhinav Kumar <quic_abhinavk@...cinc.com>,
	Rob Clark <robdclark@...il.com>,
	Dmitry Baryshkov <dmitry.baryshkov@...aro.org>,
	Kuogee Hsieh <quic_khsieh@...cinc.com>
Cc: Daniel Thompson <daniel.thompson@...aro.org>,
	Sean Paul <sean@...rly.run>,
	Marijn Suijten <marijn.suijten@...ainline.org>,
	David Airlie <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>,
	Bjorn Andersson <quic_bjorande@...cinc.com>,
	quic_jesszhan@...cinc.com, quic_sbillaka@...cinc.com,
	dri-devel@...ts.freedesktop.org, freedreno@...ts.freedesktop.org,
	linux-arm-msm@...r.kernel.org, regressions@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: drm/msm: DisplayPort hard-reset on hotplug regression in 6.8-rc1

Hi,

Since 6.8-rc1 I have seen (and received reports of) hard resets of the
Lenovo ThinkPad X13s after connecting and disconnecting an external
display.

I have triggered this on a simple disconnect while in a VT console, but
also when stopping Xorg after having repeatedly connected and
disconnected an external display and tried to enable it using xrandr.

In the former case, on one occasion the last (custom debug) messages
printed over an SSH session were:

    [  948.416358] usb 5-1: USB disconnect, device number 3
    [  948.443496] msm_dpu ae01000.display-controller: msm_fbdev_client_hotplug
    [  948.443723] msm-dp-display ae98000.displayport-controller: dp_power_clk_enable - type = 1, enable = 0
    [  948.443872] msm-dp-display ae98000.displayport-controller: dp_ctrl_phy_exit
    [  948.445117] msm-dp-display ae98000.displayport-controller: dp_ctrl_phy_exit - done
    
and then the hypervisor resets the machine.

Hotplug in Xorg seems to work worse than it did with 6.7, which also had
some issues. Connecting a display once seems to work fine, but trying to
re-enable a reconnected display using xrandr sometimes does not work at
all, while with 6.7 it usually worked on the second xrandr execution.

xrandr reports the reconnected display as disconnected:

    Screen 0: minimum 320 x 200, current 1920 x 1200, maximum 5120 x 4096
    eDP-1 connected primary 1920x1200+0+0 (normal left inverted right x axis y axis) 286mm x 178mm
       1920x1200     60.03*+
       1600x1200     60.00  
    DP-1 disconnected (normal left inverted right x axis y axis)
    DP-2 disconnected 1920x1200+0+0 (normal left inverted right x axis y axis) 0mm x 0mm
      1920x1200 (0x40c) 154.000MHz +HSync -VSync
            h: width  1920 start 1968 end 2000 total 2080 skew    0 clock  74.04KHz
            v: height 1200 start 1203 end 1209 total 1235           clock  59.95Hz

Running 'xrandr --output DP-2 --auto' 2-3 times makes xrandr report the
display as connected, but the display is still blank (unlike with 6.7).
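
For anyone trying to reproduce this, here is a minimal Python sketch of
how the stale state in the listing above (an output reported as
disconnected that still has a mode and position configured) could be
detected from xrandr output. The function name stale_outputs() and the
regular expression are illustrative assumptions and have only been
checked against the output quoted above:

```python
import re

# Matches xrandr output header lines such as:
#   "DP-2 disconnected 1920x1200+0+0 (normal ...) 0mm x 0mm"
# The geometry group is only present when a mode is still configured.
OUTPUT_RE = re.compile(
    r"^(?P<name>\S+) (?P<state>connected|disconnected)"
    r"(?: primary)?(?: (?P<geometry>\d+x\d+\+\d+\+\d+))?"
)

# Shortened copy of the listing quoted in this report.
SAMPLE = """\
Screen 0: minimum 320 x 200, current 1920 x 1200, maximum 5120 x 4096
eDP-1 connected primary 1920x1200+0+0 (normal left inverted right x axis y axis) 286mm x 178mm
   1920x1200     60.03*+
DP-1 disconnected (normal left inverted right x axis y axis)
DP-2 disconnected 1920x1200+0+0 (normal left inverted right x axis y axis) 0mm x 0mm
"""


def stale_outputs(xrandr_text):
    """Return outputs reported as disconnected that still have a mode set."""
    stale = []
    for line in xrandr_text.splitlines():
        # Strip leading whitespace so indented (quoted) listings also parse.
        match = OUTPUT_RE.match(line.strip())
        if not match:
            continue  # "Screen ..." header or a mode line
        if match.group("state") == "disconnected" and match.group("geometry"):
            stale.append(match.group("name"))
    return stale


print(stale_outputs(SAMPLE))
```

On a live system the input could instead be the text printed by a plain
'xrandr' invocation, for example captured with subprocess.run().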

A few times after having exercised hotplug this way, the machine hard
reset when Xorg was later stopped. Once I saw the following log messages
in an SSH session, but they may not have been printed directly before
the hard reset:

    [  214.555781] [drm:dpu_encoder_phys_vid_wait_for_commit_done:492] [dpu error]vblank timeout
    [  214.555843] [drm:dpu_kms_wait_for_commit_done:483] [dpu error]wait for commit done returned -110

Note that this appears to be unrelated to the recently fixed Qualcomm
power domain driver bug which can trigger similar resets when
initialising the display subsystem on boot. Specifically, I have also
triggered the hotplug resets described above with the fix applied. [1]

Reverting commit e467e0bde881 ("drm/msm/dp: use drm_bridge_hpd_notify()
to report HPD status changes"), which fixes the related VT console
regression, does not seem to make any difference here. [2]

Daniel Thompson reports, however, that reverting the whole runtime PM
series appears to make the hard resets he has seen with DisplayPort
hotplug go away:

	https://lore.kernel.org/lkml/1701472789-25951-1-git-send-email-quic_khsieh@quicinc.com/

So for now, let's assume that these regressions were also introduced (or
triggered) by commit 5814b8bf086a ("drm/msm/dp: incorporate pm_runtime
framework into DP driver").

Johan


[1] https://lore.kernel.org/lkml/20240226-rpmhpd-enable-corner-fix-v1-1-68c004cec48c@quicinc.com/
[2] https://lore.kernel.org/lkml/Zd3YPGmrprxv-N-O@hovoldconsulting.com/


#regzbot introduced: 5814b8bf086a
