Message-ID: <2770547.lGaqSPkdTl@timur-hyperion>
Date: Wed, 28 Jan 2026 13:22:53 +0100
From: Timur Kristóf <timur.kristof@...il.com>
To: Michel Dänzer <michel.daenzer@...lbox.org>,
Hamza Mahfooz <someguy@...ective-light.com>, dri-devel@...ts.freedesktop.org,
Christian König <christian.koenig@....com>
Cc: Alex Deucher <alexander.deucher@....com>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
Harry Wentland <harry.wentland@....com>, Leo Li <sunpeng.li@....com>,
Rodrigo Siqueira <siqueira@...lia.com>,
Maarten Lankhorst <maarten.lankhorst@...ux.intel.com>,
Maxime Ripard <mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>,
Sunil Khatri <sunil.khatri@....com>, Ce Sun <cesun102@....com>,
Lijo Lazar <lijo.lazar@....com>, Kenneth Feng <kenneth.feng@....com>,
Ivan Lipski <ivan.lipski@....com>, Alex Hung <alex.hung@....com>,
Tom Chung <chiahsuan.chung@....com>, Melissa Wen <mwen@...lia.com>,
Michel Dänzer <mdaenzer@...hat.com>,
Fangzhi Zuo <Jerry.Zuo@....com>, amd-gfx@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, Mario Limonciello <mario.limonciello@....com>
Subject: Re: [PATCH 1/2] drm: introduce page_flip_timeout()
On Wednesday, January 28, 2026 12:25:31 PM Central European Standard Time
Christian König wrote:
> On 1/28/26 10:19, Timur Kristóf wrote:
> > On Monday, January 26, 2026 2:00:59 PM Central European Standard Time
> > Christian König wrote:
> >> On 1/26/26 11:27, Michel Dänzer wrote:
> >>> On 1/26/26 11:14, Christian König wrote:
> >>>> On 1/23/26 15:44, Timur Kristóf wrote:
> >>>>> On Friday, January 23, 2026 2:52:44 PM Central European Standard Time
> >>>>> Christian König wrote:
> >>>>>> So as far as I can see the whole approach doesn't make any sense at
> >>>>>> all.
> >>>>>
> >>>>> Actually this approach was proposed as a solution at XDC 2025 in Harry's
> >>>>> presentation, "DRM calls driver callback to attempt recovery", see page
> >>>>> 9 in this slide deck:
> >>>>>
> >>>>> https://indico.freedesktop.org/event/10/contributions/431/attachments/267/355/2025%20XDC%20Hackfest%20Update%20v1.2.pdf
> >>>>>
> >>>>> If you disagree with Harry, please make a counter-proposal.
> >>>>
> >>>> Well I must have missed that detail otherwise I would have objected.
> >>>>
> >>>> But looking at the slide Harry actually pointed out what immediately came
> >>>> to my mind as well, e.g. that the Compositor needs to issue a full
> >>>> modeset to re-program the CRTC.
> >>>
> >>> In principle, the kernel driver has all the information it needs to
> >>> reprogram the HW by itself. Not sure why the compositor would need to be
> >>> actively involved.
> >>
> >> Well first of all I'm not sure if we can reprogram the HW even if all
> >> information is available.
> >>
> >> Please keep in mind that we are in a dma_fence timeout handler here with
> >> the usual rat tail of consequences. So no allocation of memory or taking
> >> locks under which memory is allocated or are part of preparing the page
> >> flip etc... I'm not so deep in the atomic code, so Alex, Sima and
> >> probably you as well can answer that much better than I do, but offhand
> >> it sounds questionable.
> >>
> >> On the other hand we could of course postpone reprogramming the CRTC into
> >> an async work item, but that might create more problems than it solves.
> >>
> >> Then second even if the kernel can do it I'm not sure if it should do it.
> >>
> >> I mean userspace asked for a quick page flip and not some expensive
> >> CRTC/PLL reprogramming. Stuff like that usually takes some time and by
> >> then the frame which should be displayed by the page flip might already
> >> be stale and it would be better to tell userspace that we couldn't
> >> display it on time and wait for a new frame to be generated.
> >
> > I agree with Michel here. It's a kernel bug, it should be solved by the
> > kernel. I don't like the tendency of pushing userspace to handle kernel
> > bugs, especially if this is just needed for one vendor's buggy driver.
> > (No offence.)
> Well I strongly disagree. The kernel is not here to serve userspace, but to
> give userspace access to the HW in a generalized manner.
Isn't this why kernel mode setting was invented in favour of the mess that we
used to have in the DDX drivers?
> If this is caused by a HW failure then reporting back to userspace is the
> most reasonable thing to do.
Nothing wrong with reporting the problem back to userspace. But it isn't worth
much, because userspace is extremely unlikely to be able to fix it. How would
userspace fix a missed or broken interrupt, a firmware hang, or buggy
programming of display engine registers?

Also, even if it were possible, expecting userspace to fix it would just place
an extra burden on compositor maintainers, which in turn would put us in a
situation similar to where we were with GPU recovery before queue reset was
implemented. Only a small handful of compositors can handle it (only one of
the major players and maybe a few smaller ones). That gives all other users a
bad experience by default.