[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aWSv+kVV3G18I/NJ@yuanjiey.ap.qualcomm.com>
Date: Mon, 12 Jan 2026 16:25:30 +0800
From: yuanjiey <yuanjie.yang@....qualcomm.com>
To: Dmitry Baryshkov <dmitry.baryshkov@....qualcomm.com>
Cc: robin.clark@....qualcomm.com, lumag@...nel.org, abhinav.kumar@...ux.dev,
jesszhan0024@...il.com, sean@...rly.run, marijn.suijten@...ainline.org,
airlied@...il.com, simona@...ll.ch, krzysztof.kozlowski@...aro.org,
konrad.dybcio@....qualcomm.com, linux-arm-msm@...r.kernel.org,
dri-devel@...ts.freedesktop.org, freedreno@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, tingwei.zhang@....qualcomm.com,
aiqun.yu@....qualcomm.com, yongxing.mou@....qualcomm.com
Subject: Re: [PATCH 1/2] drm/msm/dpu: fix mismatch between power and frequency
On Mon, Jan 12, 2026 at 09:38:41AM +0200, Dmitry Baryshkov wrote:
> On Mon, 12 Jan 2026 at 08:23, yuanjiey <yuanjie.yang@....qualcomm.com> wrote:
> >
> > On Fri, Jan 09, 2026 at 05:22:37PM +0200, Dmitry Baryshkov wrote:
> > > On Fri, Jan 09, 2026 at 04:38:07PM +0800, yuanjie yang wrote:
> > > > From: Yuanjie Yang <yuanjie.yang@....qualcomm.com>
> > > >
> > > > During DPU runtime suspend, calling dev_pm_opp_set_rate(dev, 0) drops
> > > > the MMCX rail to MIN_SVS while the core clock frequency remains at its
> > > > original (highest) rate. When runtime resume re-enables the clock, this
> > > > may result in a mismatch between the rail voltage and the clock rate.
> > > >
> > > > For example, in the DPU bind path, the sequence could be:
> > > > cpu0: dev_sync_state -> rpmhpd_sync_state
> > > > cpu1: dpu_kms_hw_init
> > > > timeline 0 ------------------------------------------------> t
> > > >
> > > > After rpmhpd_sync_state, the voltage performance is no longer guaranteed
> > > > to stay at the highest level. During dpu_kms_hw_init, calling
> > > > dev_pm_opp_set_rate(dev, 0) drops the voltage, causing the MMCX rail to
> > > > fall to MIN_SVS while the core clock is still at its maximum frequency.
> > >
> > > Ah, I see. dev_pm_set_rate(0) transforms to _disable_opp_table(), which
> > > doesn't do anything with clocks. I think we should have a call to
> > > clk_disable_unprepare() before that and clk_prepare_enable() in the
> > > resume path.
> >
> > I do check the backtrace on kaanapali:
> >
> > active_corner = 3 (MIN_SVS)
> > rpmhpd_aggregate_corner+0x168/0x1d0
> > rpmhpd_set_performance_state+0x84/0xd0
> > _genpd_set_performance_state+0x50/0x198
> > genpd_set_performance_state.isra.0+0xbc/0xdc
> > genpd_dev_pm_set_performance_state+0x60/0xc4
> > dev_pm_domain_set_performance_state+0x24/0x3c
> > _set_opp+0x370/0x584
> > dev_pm_opp_set_rate+0x258/0x2b4
> > dpu_runtime_suspend+0x50/0x11c [msm]
> > pm_generic_runtime_suspend+0x2c/0x44
> > genpd_runtime_suspend+0xac/0x2d0
> > __rpm_callback+0x48/0x19c
> > rpm_callback+0x74/0x80
> > rpm_suspend+0x108/0x588
> > rpm_idle+0xe8/0x190
> > __pm_runtime_idle+0x50/0x94
> > dpu_kms_hw_init+0x3a0/0x4a8
> >
> > dev_pm_opp_set_rate(dev, 0); just low power to MIN_SVS.
> > And freq MDP: 650MHz
> >
> >
> > And I try change the order:
> > from:
> > dev_pm_opp_set_rate(dev, 0);
> > clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
> > to:
> > clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
> > dev_pm_opp_set_rate(dev, 0);
> >
> > But the issue is still here.
>
> But which clock is still demanding higher MMCX voltage? All DPU clocks
> should be turned off at this point.
Yes, no DPU clock demand higher MMCX voltage here. But next time pm_runtime_get_sync
need higher MMCX voltagei due to high freq.
> >
> >
> > > > When the power is re-enabled, only the clock is enabled, leading to a
> > > > situation where the MMCX rail is at MIN_SVS but the core clock is at its
> > > > highest rate. In this state, the rail cannot sustain the clock rate,
> > > > which may cause instability or system crash.
> > > >
> > > > Fix this by setting the corresponding OPP corner during both power-on
> > > > and power-off sequences to ensure proper alignment of rail voltage and
> > > > clock frequency.
> > > >
> > > > Fixes: b0530eb11913 ("drm/msm/dpu: Use OPP API to set clk/perf state")
> > > >
> > > > Signed-off-by: Yuanjie Yang <yuanjie.yang@....qualcomm.com>
> > >
> > > No empty lines between the tags. Also please cc stable.
> >
> > Sure, will fix.
> >
> > > > ---
> > > > drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c | 16 ++++++++++++----
> > > > drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h | 3 +++
> > > > 2 files changed, 15 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> > > > index 0623f1dbed97..c31488335f2b 100644
> > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.c
> > > > @@ -1306,9 +1306,14 @@ static int dpu_kms_init(struct drm_device *ddev)
> > > > struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
> > > > struct dev_pm_opp *opp;
> > > > int ret = 0;
> > > > - unsigned long max_freq = ULONG_MAX;
> > > > + dpu_kms->max_freq = ULONG_MAX;
> > > > + dpu_kms->min_freq = 0;
> > > >
> > > > - opp = dev_pm_opp_find_freq_floor(dev, &max_freq);
> > > > + opp = dev_pm_opp_find_freq_floor(dev, &dpu_kms->max_freq);
> > > > + if (!IS_ERR(opp))
> > > > + dev_pm_opp_put(opp);
> > > > +
> > > > + opp = dev_pm_opp_find_freq_ceil(dev, &dpu_kms->min_freq);
> > > > if (!IS_ERR(opp))
> > > > dev_pm_opp_put(opp);
> > > >
> > > > @@ -1461,8 +1466,8 @@ static int __maybe_unused dpu_runtime_suspend(struct device *dev)
> > > > struct msm_drm_private *priv = platform_get_drvdata(pdev);
> > > > struct dpu_kms *dpu_kms = to_dpu_kms(priv->kms);
> > > >
> > > > - /* Drop the performance state vote */
> > > > - dev_pm_opp_set_rate(dev, 0);
> > > > + /* adjust the performance state vote to low performance state */
> > > > + dev_pm_opp_set_rate(dev, dpu_kms->min_freq);
> > >
> > > Here min_freq is the minumum working frequency, which will keep it
> > > ticking at a high frequency. I think we are supposed to turn it off
> > > (well, switch to XO). Would it be enough to swap these two lines
> > > instead?
> >
> > Yes, just clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks) to
> > disable clk is OK.
> > we can drop change here.
> >
> > > > clk_bulk_disable_unprepare(dpu_kms->num_clocks, dpu_kms->clocks);
> > > >
> > > > for (i = 0; i < dpu_kms->num_paths; i++)
> > > > @@ -1481,6 +1486,9 @@ static int __maybe_unused dpu_runtime_resume(struct device *dev)
> > > > struct drm_device *ddev;
> > > >
> > > > ddev = dpu_kms->dev;
> > > > + /* adjust the performance state vote to high performance state */
> > > > + if (dpu_kms->max_freq != ULONG_MAX)
> > > > + dev_pm_opp_set_rate(dev, dpu_kms->max_freq);
> > >
> > > This one should not be necessary, we should be setting the performance
> > > point while comitting the DRM state.
> >
> > No, issue is during dpu binding path. And in msm_drm_kms_init driver just
> > pm_runtime_resume_and_get enable power and pm_runtime_put_sync disable power.
> > But We do not set the appropriate OPP each time the power is enabled.
> > As a result, a situation may occur where the rail stays at MIN_SVS while the
> > MDP is running at a high frequency.
>
> Please move dev_pm_opp_set_rate() from dpu_kms_init() to dpu_kms_hw_init().
During dpu_kms_hw_init and msm_drm_kms_init and msm_drm_kms_post_init.
There are multiple places where pm_runtime_get_sync(pm_runtime_resume_and_get)and pm_runtime_put_sync are called.
And each time after pm_runtime_get_sync(pm_runtime_resume_and_get) will access register depond on MDP clk.
Do I need to add dev_pm_opp_set_rate after every pm_runtime_get_sync and pm_runtime_resume_and_get?
pm_runtime_get_sync
dev_pm_opp_set_rate
"access register"
pm_runtime_put_sync
pm_runtime_resume_and_get
dev_pm_opp_set_rate
"access register"
pm_runtime_put_sync
Thanks,
Yuanjie
> >
> > rpm_idle+0xe8/0x190
> > __pm_runtime_idle+0x50/0x94
> > dpu_kms_hw_init+0x3a0/0x4a8 [msm]
> > msm_drm_kms_init+0xb8/0x340 [msm]
> > msm_drm_init+0x16c/0x22c [msm]
> > msm_drm_bind+0x30/0x3c [msm]
> > try_to_bring_up_aggregate_device+0x168/0x1d4
> > __component_add+0xa4/0x170
> > component_add+0x14/0x20
> > dsi_dev_attach+0x20/0x2c [msm]
> > dsi_host_attach+0x58/0x98 [msm]
> > mipi_dsi_attach+0x30/0x54
> > novatek_nt37801_probe+0x13c/0x1c8 [panel_novatek_nt37801]
> >
> > And I found a a similar bug.
> > https://lore.kernel.org/all/20220915205559.14574-1-quic_bjorande@quicinc.com/
> >
> > Since the panel driver does not hold the property power-domains = <&rpmhpd RPMHPD_MMCX>
> > once all drivers that do own the RPMHPD_MMCX power domain finish probing,
> > the interconnect’s dev_sync_state is triggered, which eventually runs rpmhpd_sync_state
> > and starts dynamic voltage adjustment. This is when the issue occurs.
> >
> >
> > if do change below, this issue can also be fixed.
> > &mdss_dsi0 {
> > ...
> > panel@0 {
> > compatible = "novatek,nt37801";
> > ...
> > ++ power-domains = <&rpmhpd RPMHPD_MMCX>;
> > }
> > }
> > But I don't think panel driver should own a power-domains = <&rpmhpd RPMHPD_MMCX>;
>
> That's not related.
>
> >
> >
> >
> > Thanks,
> > Yuanjie
> >
> > > >
> > > > rc = clk_bulk_prepare_enable(dpu_kms->num_clocks, dpu_kms->clocks);
> > > > if (rc) {
> > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > > index 993cf512f8c5..8d2595d8a5f6 100644
> > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_kms.h
> > > > @@ -92,6 +92,9 @@ struct dpu_kms {
> > > > struct clk_bulk_data *clocks;
> > > > size_t num_clocks;
> > > >
> > > > + unsigned long max_freq;
> > > > + unsigned long min_freq;
> > > > +
> > > > /* reference count bandwidth requests, so we know when we can
> > > > * release bandwidth. Each atomic update increments, and frame-
> > > > * done event decrements. Additionally, for video mode, the
> > > > --
> > > > 2.34.1
> > > >
> > >
> > > --
> > > With best wishes
> > > Dmitry
>
>
>
> --
> With best wishes
> Dmitry
Powered by blists - more mailing lists