[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <eysj65p526ice7taxh5c7whvyb4o4rnxnydgs6wz5ccnjh5mlj@z6xqdbqqy3lw>
Date: Tue, 10 Dec 2024 01:51:11 +0200
From: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
To: Abhinav Kumar <quic_abhinavk@...cinc.com>
Cc: Rob Clark <robdclark@...il.com>, Sean Paul <sean@...rly.run>,
Marijn Suijten <marijn.suijten@...ainline.org>, David Airlie <airlied@...il.com>,
Simona Vetter <simona@...ll.ch>, Simona Vetter <simona.vetter@...ll.ch>,
linux-arm-msm@...r.kernel.org, dri-devel@...ts.freedesktop.org, freedreno@...ts.freedesktop.org,
linux-kernel@...r.kernel.org, stable@...r.kernel.org, Leonard Lausen <leonard@...sen.nl>,
György Kurucz <me@...uczgy.com>, Johan Hovold <johan+linaro@...nel.org>
Subject: Re: [PATCH v4] drm/msm/dpu1: don't choke on disabling the writeback
connector
On Mon, Dec 09, 2024 at 03:36:29PM -0800, Abhinav Kumar wrote:
>
>
> On 12/9/2024 1:55 PM, Dmitry Baryshkov wrote:
> > On Mon, 9 Dec 2024 at 21:54, Abhinav Kumar <quic_abhinavk@...cinc.com> wrote:
> > >
> > >
> > >
> > > On 12/9/2024 2:04 AM, Dmitry Baryshkov wrote:
> > > > During suspend/resume process all connectors are explicitly disabled and
> > > > then reenabled. However resume fails because of the connector_status check:
> > > >
> > > > [dpu error]connector not connected 3
> > > > [drm:drm_mode_config_helper_resume [drm_kms_helper]] *ERROR* Failed to resume (-22)
> > > >
> > > > It doesn't make sense to check for the Writeback connected status (and
> > > > other drivers don't perform such check), so drop the check.
> > > >
> > > > It wasn't a problem before the commit 71174f362d67 ("drm/msm/dpu: move
> > > > writeback's atomic_check to dpu_writeback.c"), since encoder's
> > > > atomic_check() is called under a different conditions that the
> > > > connector's atomic_check() (e.g. it is not called if there is no
> > > > connected CRTC or if the corresponding connector is not a part of the
> > > > new state).
> > > >
> > > > Fixes: 71174f362d67 ("drm/msm/dpu: move writeback's atomic_check to dpu_writeback.c")
> > > > Cc: stable@...r.kernel.org
> > > > Reported-by: Leonard Lausen <leonard@...sen.nl>
> > > > Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/57
> > > > Tested-by: Leonard Lausen <leonard@...sen.nl> # on sc7180 lazor
> > > > Reported-by: György Kurucz <me@...uczgy.com>
> > > > Link: https://lore.kernel.org/all/b70a4d1d-f98f-4169-942c-cb9006a42b40@kuruczgy.com/
> > > > Reported-by: Johan Hovold <johan+linaro@...nel.org>
> > > > Link: https://lore.kernel.org/all/ZzyYI8KkWK36FfXf@hovoldconsulting.com/
> > > > Tested-by: György Kurucz <me@...uczgy.com>
> > > > Reviewed-by: Johan Hovold <johan+linaro@...nel.org>
> > > > Tested-by: Johan Hovold <johan+linaro@...nel.org>
> > > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
> > > > ---
> > > > Leonard Lausen reported an issue with suspend/resume of the sc7180
> > > > devices. Fix the WB atomic check, which caused the issue.
> > > > ---
> > > > Changes in v4:
> > > > - Epanded commit message (Johan)
> > > > - Link to v3: https://lore.kernel.org/r/20241208-dpu-fix-wb-v3-1-a1de69ce4a1b@linaro.org
> > > >
> > > > Changes in v3:
> > > > - Rebased on top of msm-fixes
> > > > - Link to v2: https://lore.kernel.org/r/20240802-dpu-fix-wb-v2-0-7eac9eb8e895@linaro.org
> > > >
> > > > Changes in v2:
> > > > - Reworked the writeback to just drop the connector->status check.
> > > > - Expanded commit message for the debugging patch.
> > > > - Link to v1: https://lore.kernel.org/r/20240709-dpu-fix-wb-v1-0-448348bfd4cb@linaro.org
> > > > ---
> > > > drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c | 3 ---
> > > > 1 file changed, 3 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c
> > > > index 16f144cbc0c986ee266412223d9e605b01f9fb8c..8ff496082902b1ee713e806140f39b4730ed256a 100644
> > > > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c
> > > > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c
> > > > @@ -42,9 +42,6 @@ static int dpu_wb_conn_atomic_check(struct drm_connector *connector,
> > > > if (!conn_state || !conn_state->connector) {
> > > > DPU_ERROR("invalid connector state\n");
> > > > return -EINVAL;
> > > > - } else if (conn_state->connector->status != connector_status_connected) {
> > > > - DPU_ERROR("connector not connected %d\n", conn_state->connector->status);
> > > > - return -EINVAL;
> > > > }
> > >
> > > Can you please explain me about why the status was not already connected?
> > >
> > > In all the places I checked (unless I missed something), if the detect()
> > > callback is not registered, the connector is assumed connected like below:
> > >
> > > 404 if (funcs->detect_ctx)
> > > 405 ret = funcs->detect_ctx(connector, ctx, force);
> > > 406 else if (connector->funcs->detect)
> > > 407 ret = connector->funcs->detect(connector, force);
> > > 408 else
> > > 409 ret = connector_status_connected;
> > >
> > > We do not register .detect for WB as WB connector should be always
> > > connected.
> > >
> > > What scenario or use-case is making the connector status to something
> > > other than connected?
> > >
> > > The places which mark the connector as unknown seem to be covered by
> > > warnings in drm_probe_helper.c but the bug report doesnt seem to be
> > > hitting those.
> >
> > Because nobody asks for the getconnector on that connector. For
> > example,drm_client_for_each_connector_iter() explicitly skips
> > WRITEBACK connectors. So, drm_client_modeset_probe() also doesn't
> > request ->fill_modes() on that connector.
> >
> > I'm almost sure that if somebody runs a `modetest -ac` on the
> > unpatched kernel after boot, there will be no suspend-related issues.
> > In fact, I've just checked on RB5.
> > /sys/class/drm/card0-Writeback-1/status reports 'unknown' before and
> > 'connected' afterwards. You can easily replicate that on your
> > hardware.
> >
>
> Yes this is correct, I just checked on sc7180.
>
> It stays at unknown till we run IGT. This matches your explanation
> perfectly.
>
> > >
> > > I am wondering if there is some case in fwk resetting this to unknown
> > > silently (which is incorrect) and perhaps other drivers dont hit this as
> > > there is a .detect registered which always returns connected and MSM
> > > should follow that.
> > >
> > > 111 static enum drm_connector_status
> > > 112 komeda_wb_connector_detect(struct drm_connector *connector, bool force)
> > > 113 {
> > > 114 return connector_status_connected;
> > > 115 }
> > > 116
> >
> > No, that won't help. You can add a detect() callback and verify that
> > simply isn't getting called. It's not resetting the connector->status,
> > it's nobody setting it for the first time.
> >
>
> What we found is that drm_atomic_helper_suspend() which calls
> drm_atomic_helper_duplicate_state(), uses drm_for_each_connector_iter()
> which does not rely on the last atomic state but actually uses the
> config->connector_list which in-turn disables all connectors including WB.
>
> Is this expected that even though WB was not really there in the last
> atomic_state before the suspend, still ended up suspending / resuming and
> thus exposing this bug?
Note, atomic_state is a "patch", not a full state. Thus the described
behaviour is expected: it walks over all connectors and checks their
drm_connector_state information. Some of the connectors (if they were
not touched by the commit) might have been skipped from the last
drm_atomic_state structure.
>
> I am now more convinced of this change as I understand the flow better. But
> wanted to highlight above observation.
>
> > > >
> > > > crtc = conn_state->crtc;
> > > >
> > > > ---
> > > > base-commit: 86313a9cd152330c634b25d826a281c6a002eb77
> > > > change-id: 20240709-dpu-fix-wb-6cd57e3eb182
> > > >
> > > > Best regards,
> >
> >
> >
--
With best wishes
Dmitry
Powered by blists - more mailing lists