Date: Mon, 19 Feb 2024 11:41:41 +0100
From: Johan Hovold <johan@...nel.org>
To: Abhinav Kumar <quic_abhinavk@...cinc.com>,
	Rob Clark <robdclark@...il.com>,
	Dmitry Baryshkov <dmitry.baryshkov@...aro.org>,
	Kuogee Hsieh <quic_khsieh@...cinc.com>
Cc: Sean Paul <sean@...rly.run>,
	Marijn Suijten <marijn.suijten@...ainline.org>,
	David Airlie <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>,
	Bjorn Andersson <quic_bjorande@...cinc.com>,
	quic_jesszhan@...cinc.com, quic_sbillaka@...cinc.com,
	dri-devel@...ts.freedesktop.org, freedreno@...ts.freedesktop.org,
	linux-arm-msm@...r.kernel.org, regressions@...ts.linux.dev,
	linux-kernel@...r.kernel.org
Subject: drm/msm: Second DisplayPort regression in 6.8-rc1

On Sat, Feb 17, 2024 at 04:14:58PM +0100, Johan Hovold wrote:
> On Wed, Feb 14, 2024 at 02:52:06PM +0100, Johan Hovold wrote:
> > On Tue, Feb 13, 2024 at 10:00:13AM -0800, Abhinav Kumar wrote:

> Since Dmitry had trouble reproducing this issue I took a closer look at
> the DRM aux bridge series that Abhinav pointed to and was able to track
> down the bridge regressions and come up with a reproducer. I just posted
> a series fixing this here:
> 
> 	https://lore.kernel.org/lkml/20240217150228.5788-1-johan+linaro@kernel.org/
> 
> As I mentioned in the cover letter, I am still seeing intermittent hard
> resets around the time that the DRM subsystem is initialising, which
> suggests that we may be dealing with two separate DRM regressions here
> however.
> 
> If the hard resets are triggered by something like unclocked hardware,
> perhaps that bit could be related to the runtime PM rework?

It seems my initial suspicion that at least some of these regressions
were related to the runtime PM work was correct. The hard resets happen
when the DP controller is runtime suspended after being probed:

[   16.748475] bus: 'platform': __driver_probe_device: matched device ae00000.display-subsystem with driver msm-mdss
[   16.759444] msm-mdss ae00000.display-subsystem: Adding to iommu group 21
[   16.795226] bus: 'platform': __driver_probe_device: matched device ae01000.display-controller with driver msm_dpu
[   16.807542] probe of ae01000.display-controller returned -517 after 3 usecs
[   16.821552] bus: 'platform': __driver_probe_device: matched device ae90000.displayport-controller with driver msm-dp-display
[   16.837749] probe of ae90000.displayport-controller returned -517 after 1 usecs
[  OK  ] Listening on Load/Save RF Kill Swit[   16.854659] bus: 'platform': __dch Status /dev/rfkill Watch.
[   16.868458] probe of ae98000.displayport-controller returned -517 after 2 usecs
[   16.880012] bus: 'platform': __driver_probe_device: matched device aea0000.displayport-controller with driver msm-dp-display
[   16.891856] probe of aea0000.displayport-controller returned -517 after 2 usecs
[   16.903825] probe of ae00000.display-subsystem returned 0 after 144497 usecs
[   16.911636] bus: 'platform': __driver_probe_device: matched device ae01000.display-controller with driver msm_dpu
[   16.942092] probe of ae01000.display-controller returned 0 after 19593 usecs
         Starting Load/Save Screen Backligh…rightness[   16.959146] bus: 'platform': _ of backlight:backlight...
[   16.995355] msm-dp-display ae90000.displayport-controller: dp_display_probe - probe tail
[   17.004032] probe of ae90000.displayport-controller returned 0 after 30225 usecs
[   17.012308] bus: 'platform': __driver_probe_device: matched device ae98000.displayport-controller with driver msm-dp-display
[   17.050193] msm-dp-display ae98000.displayport-controller: dp_display_probe - probe tail
         Starting Network Name Resolution...
[   17.058925] probe of ae98000.displayport-controller returned 0 after 34774 usecs
[   17.074925] bus: 'platform': __driver_probe_device: matched device aea0000.displayport-controller with driver msm-dp-display
[        Starting Network Time Synchronization...
[   17.112000] msm-dp-display aea0000.displayport-controller: dp_display_probe - populate aux bus
[   17.125208] msm-dp-display aea0000.displayport-controller: dp_pm_runtime_resume
         Starting Record System Boot/Shutdown in UTMP...
         Starting Virtual Console Setup...
[  OK  ] Finished Load/Save Screen Backlight Brightness of backlight:backlight.
[   17.197909] msm-dp-display aea0000.displayport-controller: dp_pm_runtime_suspend
[   17.198079] probe of aea0Format: Log Type - Time(microsec) - Message - Optional Info
Log Type: B - Since Boot(Power On Reset),  D - Delta,  S - Statistic
S - QC_IMAGE_VERSION_STRING=BOOT.MXF.1.1-00470-MAKENA-1
S - IMAGE_VARIANT_STRING=SocMakenaWP
S - OEM_IMAGE_VERSION_STRING=crm-ubuntu92

  < machine is reset by hypervisor >

Presumably the reset happens when the controller is being shut down while
still being used by the EFI framebuffer.

In the cases where the machine survives boot, the controller is never
suspended.
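
A minimal sketch for checking a captured console log for the pattern
described above. It assumes the log contains the "probe of ... returned"
lines and the dp_pm_runtime_suspend printouts in the same form as in the
log above; the function names are hypothetical and only for illustration.
Note that -517 is -EPROBE_DEFER, so those probe returns are deferrals
rather than failures.

```python
import re

# -517 is -EPROBE_DEFER: the driver core retries the probe later.
EPROBE_DEFER = -517

# Line formats are taken from the log above and are assumptions about
# what a given kernel prints (the suspend message may be debug-only).
PROBE_RE = re.compile(r"probe of (\S+) returned (-?\d+) after (\d+) usecs")
SUSPEND_RE = re.compile(r"(\S+): dp_pm_runtime_suspend")


def scan_log(text):
    """Collect probe return codes per device and the devices that were
    seen being runtime suspended, in log order."""
    probes = {}
    suspended = []
    for line in text.splitlines():
        m = PROBE_RE.search(line)
        if m:
            probes.setdefault(m.group(1), []).append(int(m.group(2)))
        m = SUSPEND_RE.search(line)
        if m:
            suspended.append(m.group(1))
    return probes, suspended


def deferred_only(probes):
    """Devices whose logged probe attempts were all deferred."""
    return sorted(
        dev for dev, codes in probes.items()
        if all(code == EPROBE_DEFER for code in codes)
    )
```

Calling scan_log() on a saved console log and finding a DP controller in
the suspended list would match the failing boots; an empty list would
match the boots that survive.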

When investigating this I've also seen the following error intermittently:

	[drm:dp_display_probe [msm]] *ERROR* device tree parsing failed

which also appears to be related to the runtime PM rework:

	https://lore.kernel.org/lkml/1701472789-25951-1-git-send-email-quic_khsieh@quicinc.com/

I believe this is enough evidence to conclude that this second
regression was introduced by commit 5814b8bf086a ("drm/msm/dp:
incorporate pm_runtime framework into DP driver"):

#regzbot introduced: 5814b8bf086a

Has anyone given some thought to how the framebuffer handover is
supposed to work? It seems we're currently just relying on luck with
timing.

Johan
