[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CADnq5_OXUKj=bfK0NOAhOzmhYCSnQXbxHbwLOaBQ6wFX033Wgw@mail.gmail.com>
Date: Wed, 19 Jun 2024 09:50:05 -0400
From: Alex Deucher <alexdeucher@...il.com>
To: Doug Anderson <dianders@...omium.org>
Cc: dri-devel@...ts.freedesktop.org, Maxime Ripard <mripard@...nel.org>,
Alex Deucher <alexander.deucher@....com>, Christian König <christian.koenig@....com>,
Xinhui Pan <Xinhui.Pan@....com>, André Almeida <andrealmeid@...lia.com>,
Aurabindo Pillai <aurabindo.pillai@....com>, Candice Li <candice.li@....com>,
Daniel Vetter <daniel@...ll.ch>, David Airlie <airlied@...il.com>,
Hamza Mahfooz <hamza.mahfooz@....com>, Hawking Zhang <Hawking.Zhang@....com>, Le Ma <le.ma@....com>,
Lijo Lazar <lijo.lazar@....com>, Ma Jun <Jun.Ma2@....com>,
Mario Limonciello <mario.limonciello@....com>, Shashank Sharma <shashank.sharma@....com>,
Srinivasan Shanmugam <srinivasan.shanmugam@....com>, Thomas Zimmermann <tzimmermann@...e.de>,
Victor Lu <victorchengchi.lu@....com>, amd-gfx@...ts.freedesktop.org,
chenxuebing <chenxb_99091@....com>, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2 8/8] drm/amdgpu: Call drm_atomic_helper_shutdown() at
shutdown time
On Tue, Jun 18, 2024 at 7:53 PM Doug Anderson <dianders@...omium.org> wrote:
>
> Hi,
>
> On Tue, Jun 18, 2024 at 3:00 PM Alex Deucher <alexdeucher@...il.com> wrote:
> >
> > On Tue, Jun 18, 2024 at 5:40 PM Doug Anderson <dianders@...omium.org> wrote:
> > >
> > > Hi,
> > >
> > >
> > > On Mon, Jun 17, 2024 at 8:01 AM Alex Deucher <alexdeucher@...il.com> wrote:
> > > >
> > > > On Wed, Jun 12, 2024 at 6:37 PM Douglas Anderson <dianders@...omium.org> wrote:
> > > > >
> > > > > Based on grepping through the source code this driver appears to be
> > > > > missing a call to drm_atomic_helper_shutdown() at system shutdown
> > > > > time. Among other things, this means that if a panel is in use that it
> > > > > won't be cleanly powered off at system shutdown time.
> > > > >
> > > > > The fact that we should call drm_atomic_helper_shutdown() in the case
> > > > > of OS shutdown/restart comes straight out of the kernel doc "driver
> > > > > instance overview" in drm_drv.c.
> > > > >
> > > > > Suggested-by: Maxime Ripard <mripard@...nel.org>
> > > > > Cc: Alex Deucher <alexander.deucher@....com>
> > > > > Cc: Christian König <christian.koenig@....com>
> > > > > Cc: Xinhui Pan <Xinhui.Pan@....com>
> > > > > Signed-off-by: Douglas Anderson <dianders@...omium.org>
> > > > > ---
> > > > > This commit is only compile-time tested.
> > > > >
> > > > > ...and further, I'd say that this patch is more of a plea for help
> > > > > than a patch I think is actually right. I'm _fairly_ certain that
> > > > > drm/amdgpu needs this call at shutdown time but the logic is a bit
> > > > > hard for me to follow. I'd appreciate if anyone who actually knows
> > > > > what this should look like could illuminate me, or perhaps even just
> > > > > post a patch themselves!
> > > >
> > > > I'm not sure this patch makes sense or not. The driver doesn't really
> > > > do a formal tear down in its shutdown routine, it just quiesces the
> > > > hardware. What are the actual requirements of the shutdown function?
> > > > In the past when we did a full driver tear down in shutdown, it
> > > > delayed the shutdown sequence and users complained.
> > >
> > > The "inspiration" for this patch is to handle panels properly.
> > > Specifically, panels often have several power/enable signals going to
> > > them and often have requirements that these signals are powered off in
> > > the proper order with the proper delays between them. While we can't
> > > always do so when the system crashes / reboots in an uncontrolled way,
> > > panel manufacturers / HW Engineers get upset if we don't power things
> > > off properly during an orderly shutdown/reboot. When panels are
> > > powered off badly it can cause garbage on the screen and, so I've been
> > > told, can even cause long term damage to the panels over time.
> > >
> > > In Linux, some panel drivers have tried to ensure a proper poweroff of
> > > the panel by handling the shutdown() call themselves. However, this is
> > > ugly and panel maintainers want panel drivers to stop doing it. We
> > > have removed the code doing this from most panels now [1]. Instead the
> > > assumption is that the DRM modeset drivers should be calling
> > > drm_atomic_helper_shutdown() which will make sure panels get an
> > > orderly shutdown.
> > >
> > > For a lot more details, see the cover letter [2] which then contains
> > > links to even more discussions about the topic.
> > >
> > > [1] https://lore.kernel.org/r/20240605002401.2848541-1-dianders@chromium.org
> > > [2] https://lore.kernel.org/r/20240612222435.3188234-1-dianders@chromium.org
> >
> > I don't think it's an issue. We quiesce the hardware as if we were
> > about to suspend the system (e.g., S3). For the display hardware we
> > call drm_atomic_helper_suspend() as part of that sequence.
>
> OK. It's no skin off my teeth and we can drop this patch if you're
> convinced it's not needed. From the point of view of someone who has
> no experience with this driver it seems weird to me that it would use
> drm_atomic_helper_suspend() at shutdown time instead of the documented
> drm_atomic_helper_shutdown(), but if it works for everyone then I'm
> not gonna complain.
I think the problem is that it is not clear exactly what the
expectations are around the PCI shutdown callback. The documentation
says:
"Hook into reboot_notifier_list (kernel/sys.c). Intended to stop any
idling DMA operations. Useful for enabling wake-on-lan (NIC) or
changing the power state of a device before reboot. e.g.
drivers/net/e100.c."
We tried a full driver teardown in the shutdown callback and it added
a lot of latency that really wasn't needed since the system was just
going into a reboot or power down. The best middle ground was to just
leverage our hw level suspend code to quiesce the hardware. Adding
complexity to call drm_atomic_helper_suspend() vs
drm_atomic_helper_shutdown() doesn't seem worth it since the functions
do pretty much the same thing (both call
drm_atomic_helper_disable_all()). Maybe it's better to update the
documentation to recommend drm_atomic_helper_suspend() if drivers want
to leverage their suspend code?
Alex
Powered by blists - more mailing lists