Message-ID: <4e5a938cb36f075596836aea98b54ae44a65c99d.camel@gmail.com>
Date: Wed, 07 May 2025 15:15:12 +0100
From: Vitor Soares <ivitro@...il.com>
To: Tomi Valkeinen <tomi.valkeinen@...asonboard.com>
Cc: Vitor Soares <vitor.soares@...adex.com>,
dri-devel@...ts.freedesktop.org, linux-kernel@...r.kernel.org, Aradhya
Bhatia <aradhya.bhatia@...ux.dev>, Jayesh Choudhary <j-choudhary@...com>,
stable@...r.kernel.org, Andrzej Hajda <andrzej.hajda@...el.com>, Neil
Armstrong <neil.armstrong@...aro.org>, Robert Foss <rfoss@...nel.org>,
Laurent Pinchart <Laurent.pinchart@...asonboard.com>, Jonas Karlman
<jonas@...boo.se>, Jernej Skrabec <jernej.skrabec@...il.com>, Maarten
Lankhorst <maarten.lankhorst@...ux.intel.com>, Maxime Ripard
<mripard@...nel.org>, Thomas Zimmermann <tzimmermann@...e.de>, David Airlie
<airlied@...il.com>, Simona Vetter <simona@...ll.ch>
Subject: Re: [PATCH v1] drm/bridge: cdns-dsi: Replace deprecated
UNIVERSAL_DEV_PM_OPS()
On Mon, 2025-05-05 at 21:03 +0300, Tomi Valkeinen wrote:
Hello,
> Hi,
>
> On 05/05/2025 20:47, Vitor Soares wrote:
> > On Mon, 2025-05-05 at 18:30 +0300, Tomi Valkeinen wrote:
> > > Hi,
> > >
> > > On 05/05/2025 17:45, Vitor Soares wrote:
> > > > On Tue, 2025-04-29 at 09:32 +0300, Tomi Valkeinen wrote:
> > > > > Hi,
> > > > >
> > > > > On 28/04/2025 12:40, Vitor Soares wrote:
> > > > > > From: Vitor Soares <vitor.soares@...adex.com>
> > > > > >
> > > > > > The deprecated UNIVERSAL_DEV_PM_OPS() macro uses the provided
> > > > > > callbacks
> > > > > > for both runtime PM and system sleep. This causes the DSI clocks to
> > > > > > be
> > > > > > disabled twice: once during runtime suspend and again during system
> > > > > > suspend, resulting in a WARN message from the clock framework when
> > > > > > attempting to disable already-disabled clocks.
> > > > > >
> > > > > > [ 84.384540] clk:231:5 already disabled
> > > > > > [ 84.388314] WARNING: CPU: 2 PID: 531 at /drivers/clk/clk.c:1181
> > > > > > clk_core_disable+0xa4/0xac
> > > > > > ...
> > > > > > [ 84.579183] Call trace:
> > > > > > [ 84.581624] clk_core_disable+0xa4/0xac
> > > > > > [ 84.585457] clk_disable+0x30/0x4c
> > > > > > [ 84.588857] cdns_dsi_suspend+0x20/0x58 [cdns_dsi]
> > > > > > [ 84.593651] pm_generic_suspend+0x2c/0x44
> > > > > > [ 84.597661] ti_sci_pd_suspend+0xbc/0x15c
> > > > > > [ 84.601670] dpm_run_callback+0x8c/0x14c
> > > > > > [ 84.605588] __device_suspend+0x1a0/0x56c
> > > > > > [ 84.609594] dpm_suspend+0x17c/0x21c
> > > > > > [ 84.613165] dpm_suspend_start+0xa0/0xa8
> > > > > > [ 84.617083] suspend_devices_and_enter+0x12c/0x634
> > > > > > [ 84.621872] pm_suspend+0x1fc/0x368
> > > > > >
> > > > > > To address this issue, replace UNIVERSAL_DEV_PM_OPS() with
> > > > > > DEFINE_RUNTIME_DEV_PM_OPS(), which avoids redundant suspend/resume
> > > > > > calls
> > > > > > by checking if the device is already runtime suspended.
> > > > > >
> > > > > > Cc: <stable@...r.kernel.org> # 6.1.x
> > > > > > Fixes: e19233955d9e ("drm/bridge: Add Cadence DSI driver")
> > > > > > Signed-off-by: Vitor Soares <vitor.soares@...adex.com>
> > > > > > ---
> > > > > > drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c | 10 +++++-----
> > > > > > 1 file changed, 5 insertions(+), 5 deletions(-)
> > > > > >
> > > > > > diff --git a/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c
> > > > > > b/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c
> > > > > > index b022dd6e6b6e..62179e55e032 100644
> > > > > > --- a/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c
> > > > > > +++ b/drivers/gpu/drm/bridge/cadence/cdns-dsi-core.c
> > > > > > @@ -1258,7 +1258,7 @@ static const struct mipi_dsi_host_ops
> > > > > > cdns_dsi_ops
> > > > > > = {
> > > > > > .transfer = cdns_dsi_transfer,
> > > > > > };
> > > > > >
> > > > > > -static int __maybe_unused cdns_dsi_resume(struct device *dev)
> > > > > > +static int cdns_dsi_resume(struct device *dev)
> > > > > > {
> > > > > > struct cdns_dsi *dsi = dev_get_drvdata(dev);
> > > > > >
> > > > > > @@ -1269,7 +1269,7 @@ static int __maybe_unused
> > > > > > cdns_dsi_resume(struct
> > > > > > device *dev)
> > > > > > return 0;
> > > > > > }
> > > > > >
> > > > > > -static int __maybe_unused cdns_dsi_suspend(struct device *dev)
> > > > > > +static int cdns_dsi_suspend(struct device *dev)
> > > > > > {
> > > > > > struct cdns_dsi *dsi = dev_get_drvdata(dev);
> > > > > >
> > > > > > @@ -1279,8 +1279,8 @@ static int __maybe_unused
> > > > > > cdns_dsi_suspend(struct
> > > > > > device *dev)
> > > > > > return 0;
> > > > > > }
> > > > > >
> > > > > > -static UNIVERSAL_DEV_PM_OPS(cdns_dsi_pm_ops, cdns_dsi_suspend,
> > > > > > cdns_dsi_resume,
> > > > > > - NULL);
> > > > > > +static DEFINE_RUNTIME_DEV_PM_OPS(cdns_dsi_pm_ops, cdns_dsi_suspend,
> > > > > > + cdns_dsi_resume, NULL);
> > > > >
> > > > > I'm not sure if this, or the UNIVERSAL_DEV_PM_OPS, is right here. When
> > > > > the system is suspended, the bridge drivers will get a call to the
> > > > > *_disable() hook, which then disables the device. If the bridge driver
> > > > > would additionally do something in its system suspend hook, it would
> > > > > conflict with normal disable path.
> > > > >
> > > > > I think bridges/panels should only deal with runtime PM.
> > > > >
> > > > > Tomi
> > > > >
> > > >
> > > > In the proposed change, we make use of pm_runtime_force_suspend() during
> > > > system-wide suspend. If the device is already suspended, this call is a
> > > > no-op and disables runtime PM to prevent spurious wakeups during the
> > > > suspend period. Otherwise, it triggers the device’s runtime_suspend()
> > > > callback.
> > > >
> > > > I briefly reviewed other bridge drivers, and those that implement
> > > > runtime
> > > > PM appear to follow a similar approach, relying solely on runtime PM
> > > > callbacks and using pm_runtime_force_suspend()/resume() to handle
> > > > system-wide transitions.
> > >
> > > Yes, I see such a solution in some of the bridge and panel drivers. I'm
> > > probably missing something here, as I don't think it's correct.
> > >
> > > Why do we need to set the system suspend/resume hooks? What is the
> > > scenario where those will be called, and the
> > > pm_runtime_force_suspend()/resume() do something that's not already done
> > > via the normal DRM pipeline enable/disable?
> > >
> > > Tomi
> > >
> >
> > I'm not a DRM expert, but my understanding is that there might be edge cases
> > where the system suspend sequence occurs without the DRM core properly
> > disabling
> > the bridge — for example, due to a bug or if the bridge is not bound to an
> > active pipeline. In such cases, having suspend/resume callbacks ensures that
> > the
> > device is still properly suspended and resumed.
> >
> > Additionally, pm_runtime_force_suspend() disables runtime PM for the device
> > during system suspend, preventing unintended wakeups (e.g., via IRQs,
> > delayed
> > work, or sysfs access) until pm_runtime_force_resume() is invoked.
> >
> > From my perspective, the use of pm_runtime_force_suspend() and
> > pm_runtime_force_resume() serves as a safety mechanism to guarantee a well-
> > defined and race-free state during system suspend.
>
> But then we must be sure that the suspend sequence is just right.
>
> At least in tidss's case, tidss_drv.c has tidss_suspend() which calls
> drm_mode_config_helper_suspend(), which, if I recall right, will then
> disable the pipeline. This must happen before the bridge's system
> suspend call, otherwise the bridge might go to suspend while the
> pipeline is still running, which might cause errors on the still-running
> pipeline entities, and probably crash the bridge's disable() call. If a
> bridge is a platform device, I don't think there's any ordering between
> the tidss's and the bridge's suspend calls.
>
> If the bridge is not bound to a pipeline, why would it be enabled in the
> first place?
>
> For the bug case... We're in random territory, then. If the driver is
> bugging, are you sure it's safe and useful to suspend it? Or would it be
> better to not do anything...
>
> I'm not nacking the patch, as this approach seems to be used in multiple
> drivers. It just rings multiple alarm bells here, and I don't understand
> how exactly it's supposed to work. That said, the driver is using
> UNIVERSAL_DEV_PM_OPS(), so I think switching to
> DEFINE_RUNTIME_DEV_PM_OPS() is at least not worse (well, I can't be
> quite sure even about that =).
>
> Tomi
>
I ran further tests based on your concerns, specifically regarding the suspend
ordering between tidss_suspend() and the bridge's suspend. Here are my
observations:
- The bridge (controlled via TI SCI PD) suspends after tidss_suspend(), which
  uses device-specific PM operations (platform_pm_suspend).
- I attempted to influence the probe/suspend order via DT node placement and
  delays, but that had no effect on suspend sequencing.
- I added some debug prints, which show that pm_runtime_force_suspend() is
  invoked before cdns_dsi_suspend(). I did not observe any misbehavior during
  suspend/resume.
- I also tested with only runtime PM support in the driver (without
  pm_runtime_force_suspend/resume()), and I could detect neither a functional
  difference nor the issue originally addressed in this patch.
Given that, I will send a v2 of the patch implementing only runtime PM support.
If issues arise in the future due to the lack of explicit
pm_runtime_force_suspend/resume() handling, we can revisit and address them at
that time with clearer justification.
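
For reference, the three variants discussed in this thread can be compared with
a small userspace model. This is only a sketch, not driver code: every toy_*
name is hypothetical, the clock is reduced to an enable counter, and the
"force suspend" case models only the "already runtime suspended" check, not
the disabling of runtime PM that pm_runtime_force_suspend() also performs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a clock: counts enables and records unbalanced disables. */
struct toy_clk {
	int enable_count;
	int warnings;	/* models the "already disabled" WARN */
};

/* Toy stand-in for the device and its runtime PM status. */
struct toy_dev {
	struct toy_clk clk;
	bool runtime_suspended;
};

enum toy_pm_style {
	TOY_UNIVERSAL,		/* same callback for runtime PM and system sleep */
	TOY_FORCE_SUSPEND,	/* system sleep skips an already suspended device */
	TOY_RUNTIME_ONLY,	/* no system sleep callback at all */
};

static void toy_clk_enable(struct toy_clk *clk)
{
	clk->enable_count++;
}

static void toy_clk_disable(struct toy_clk *clk)
{
	if (clk->enable_count == 0) {
		clk->warnings++;	/* disabling an already-disabled clock */
		return;
	}
	clk->enable_count--;
}

/* The driver's single suspend callback: it only disables the clock. */
static void toy_suspend_cb(struct toy_dev *dev)
{
	toy_clk_disable(&dev->clk);
}

static void toy_runtime_suspend(struct toy_dev *dev)
{
	if (dev->runtime_suspended)
		return;
	toy_suspend_cb(dev);
	dev->runtime_suspended = true;
}

static void toy_system_suspend(struct toy_dev *dev, enum toy_pm_style style)
{
	switch (style) {
	case TOY_UNIVERSAL:
		/* Callback runs again, regardless of the runtime PM status. */
		toy_suspend_cb(dev);
		break;
	case TOY_FORCE_SUSPEND:
		/* Callback runs only if the device is not yet suspended. */
		if (!dev->runtime_suspended) {
			toy_suspend_cb(dev);
			dev->runtime_suspended = true;
		}
		break;
	case TOY_RUNTIME_ONLY:
		/* Nothing to do: the device was suspended via runtime PM. */
		break;
	}
}

/* Runtime suspend followed by system suspend; returns the warning count. */
static int toy_run(enum toy_pm_style style)
{
	struct toy_dev dev = {
		.clk = { .enable_count = 0, .warnings = 0 },
		.runtime_suspended = false,
	};

	toy_clk_enable(&dev.clk);	/* device in use */
	toy_runtime_suspend(&dev);	/* pipeline disabled, device idles */
	toy_system_suspend(&dev, style);

	assert(dev.clk.enable_count == 0);
	return dev.clk.warnings;
}
```

In this model only TOY_UNIVERSAL produces a warning, which corresponds to the
double disable from the original report. The model assumes the device has
already been runtime suspended by the time system suspend starts.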
Best regards,
Vitor Soares