Message-ID: <CADnq5_P0QUmKNHyrs5qBkE+EWh1-i5U0+vNRqvwPwqqQPfNqZw@mail.gmail.com>
Date: Fri, 23 Jan 2026 14:41:34 -0500
From: Alex Deucher <alexdeucher@...il.com>
To: Hamza Mahfooz <someguy@...ective-light.com>
Cc: Christian König <christian.koenig@....com>, 
	dri-devel@...ts.freedesktop.org, Alex Deucher <alexander.deucher@....com>, 
	David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>, 
	Harry Wentland <harry.wentland@....com>, Leo Li <sunpeng.li@....com>, 
	Rodrigo Siqueira <siqueira@...lia.com>, Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>, 
	Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>, 
	Sunil Khatri <sunil.khatri@....com>, Ce Sun <cesun102@....com>, Lijo Lazar <lijo.lazar@....com>, 
	Kenneth Feng <kenneth.feng@....com>, Ivan Lipski <ivan.lipski@....com>, 
	Alex Hung <alex.hung@....com>, Tom Chung <chiahsuan.chung@....com>, 
	Melissa Wen <mwen@...lia.com>, Michel Dänzer <mdaenzer@...hat.com>, 
	Fangzhi Zuo <Jerry.Zuo@....com>, Timur Kristóf <timur.kristof@...il.com>, 
	amd-gfx@...ts.freedesktop.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 1/2] drm: introduce page_flip_timeout()

On Fri, Jan 23, 2026 at 9:52 AM Hamza Mahfooz
<someguy@...ective-light.com> wrote:
>
> On Fri, Jan 23, 2026 at 02:52:44PM +0100, Christian König wrote:
> > I can only see two reasons why you could run into a timeout:
> >
> > 1. A dma_fence never signals.
> >       How that should be handled is already well documented and doesn't require any of this.
> >
> > 2. A coding error in the vblank or page flip handler leading to waiting forever.
> >       In that case calling back into the driver doesn't help either.
> >
> > So as far as I can see the whole approach doesn't make any sense at all.
>
> It appears that resetting display firmware is able to put at least a
> subset of these systems back into a consistent (usable) state. Though, I
> don't have a reliable way to reproduce the issue that I'm seeing so I
> can't say for sure what it boils down to.

I'm not at all an expert on KMS, but I took a quick look at the in and
out fences in KMS, and I think I know what might be going on.  The out
fence is signalled by calling drm_crtc_send_vblank_event() from the
interrupt handler for the vblank/pageflip interrupt.  If that
interrupt gets missed somehow, drm_crtc_send_vblank_event() never gets
called and userspace will wait forever.  As a safety measure, maybe add
a worker thread that gets scheduled when the atomic commit happens, and
cancel the worker in the interrupt handler.  If the interrupt never
happens, the worker will eventually run, call
drm_crtc_send_vblank_event(), and get things unstuck.
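
To make the idea concrete, here is a rough userspace model of that
watchdog logic.  It is only a sketch: every name in it (flip_state,
atomic_commit(), flip_irq(), watchdog_tick(), send_event()) is
hypothetical and none of it is DRM API.  send_event() stands in for
drm_crtc_send_vblank_event(), and the armed flag plus deadline stand in
for whatever delayed worker a real driver would schedule.  Time is
passed in explicitly so the behaviour is deterministic.

```c
#include <stdbool.h>

/* Hypothetical model of one CRTC's pending page flip; not a DRM struct. */
struct flip_state {
	bool event_pending;     /* completion event not yet delivered */
	bool watchdog_armed;    /* stands in for a scheduled delayed worker */
	unsigned long deadline; /* time at which the watchdog may fire */
	int events_sent;        /* how many times the event was delivered */
	int timeouts;           /* how many times the watchdog had to step in */
};

/* Stand-in for drm_crtc_send_vblank_event(): deliver at most once. */
static void send_event(struct flip_state *s)
{
	if (!s->event_pending)
		return;
	s->event_pending = false;
	s->events_sent++;
}

/* At commit time: queue the event and arm the watchdog. */
static void atomic_commit(struct flip_state *s, unsigned long now,
			  unsigned long timeout)
{
	s->event_pending = true;
	s->watchdog_armed = true;
	s->deadline = now + timeout;
}

/* Normal path: the vblank/pageflip interrupt cancels the watchdog. */
static void flip_irq(struct flip_state *s)
{
	s->watchdog_armed = false;
	send_event(s);
}

/* Fallback path: runs only if the interrupt never cancelled the watchdog. */
static void watchdog_tick(struct flip_state *s, unsigned long now)
{
	if (!s->watchdog_armed || now < s->deadline)
		return;
	s->watchdog_armed = false;
	s->timeouts++;
	send_event(s);
}
```

The event_pending check is what keeps a late interrupt from sending
the event a second time after the watchdog has already fired.  Real
driver code would also need locking between the interrupt handler and
the worker, which this model leaves out.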

Alex
