lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite for Android: free password hash cracker in your pocket
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <65f9b40f-3ed1-4842-8992-3be14137aada@quicinc.com>
Date: Mon, 9 Dec 2024 16:11:33 -0800
From: Abhinav Kumar <quic_abhinavk@...cinc.com>
To: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
CC: Rob Clark <robdclark@...il.com>, Sean Paul <sean@...rly.run>,
        Marijn
 Suijten <marijn.suijten@...ainline.org>,
        David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>,
        Simona Vetter <simona.vetter@...ll.ch>,
        <linux-arm-msm@...r.kernel.org>, <dri-devel@...ts.freedesktop.org>,
        <freedreno@...ts.freedesktop.org>, <linux-kernel@...r.kernel.org>,
        <stable@...r.kernel.org>, Leonard Lausen <leonard@...sen.nl>,
        György Kurucz <me@...uczgy.com>,
        Johan Hovold
	<johan+linaro@...nel.org>
Subject: Re: [PATCH v4] drm/msm/dpu1: don't choke on disabling the writeback
 connector



On 12/9/2024 4:07 PM, Dmitry Baryshkov wrote:
> On Tue, 10 Dec 2024 at 02:03, Abhinav Kumar <quic_abhinavk@...cinc.com> wrote:
>>
>>
>>
>> On 12/9/2024 3:51 PM, Dmitry Baryshkov wrote:
>>> On Mon, Dec 09, 2024 at 03:36:29PM -0800, Abhinav Kumar wrote:
>>>>
>>>>
>>>> On 12/9/2024 1:55 PM, Dmitry Baryshkov wrote:
>>>>> On Mon, 9 Dec 2024 at 21:54, Abhinav Kumar <quic_abhinavk@...cinc.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 12/9/2024 2:04 AM, Dmitry Baryshkov wrote:
>>>>>>> During suspend/resume process all connectors are explicitly disabled and
>>>>>>> then reenabled. However resume fails because of the connector_status check:
>>>>>>>
>>>>>>> [dpu error]connector not connected 3
>>>>>>> [drm:drm_mode_config_helper_resume [drm_kms_helper]] *ERROR* Failed to resume (-22)
>>>>>>>
>>>>>>> It doesn't make sense to check for the Writeback connected status (and
>>>>>>> other drivers don't perform such check), so drop the check.
>>>>>>>
>>>>>>> It wasn't a problem before the commit 71174f362d67 ("drm/msm/dpu: move
>>>>>>> writeback's atomic_check to dpu_writeback.c"), since encoder's
>>>>>>> atomic_check() is called under a different conditions that the
>>>>>>> connector's atomic_check() (e.g. it is not called if there is no
>>>>>>> connected CRTC or if the corresponding connector is not a part of the
>>>>>>> new state).
>>>>>>>
>>>>>>> Fixes: 71174f362d67 ("drm/msm/dpu: move writeback's atomic_check to dpu_writeback.c")
>>>>>>> Cc: stable@...r.kernel.org
>>>>>>> Reported-by: Leonard Lausen <leonard@...sen.nl>
>>>>>>> Closes: https://gitlab.freedesktop.org/drm/msm/-/issues/57
>>>>>>> Tested-by: Leonard Lausen <leonard@...sen.nl> # on sc7180 lazor
>>>>>>> Reported-by: György Kurucz <me@...uczgy.com>
>>>>>>> Link: https://lore.kernel.org/all/b70a4d1d-f98f-4169-942c-cb9006a42b40@kuruczgy.com/
>>>>>>> Reported-by: Johan Hovold <johan+linaro@...nel.org>
>>>>>>> Link: https://lore.kernel.org/all/ZzyYI8KkWK36FfXf@hovoldconsulting.com/
>>>>>>> Tested-by: György Kurucz <me@...uczgy.com>
>>>>>>> Reviewed-by: Johan Hovold <johan+linaro@...nel.org>
>>>>>>> Tested-by: Johan Hovold <johan+linaro@...nel.org>
>>>>>>> Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
>>>>>>> ---
>>>>>>> Leonard Lausen reported an issue with suspend/resume of the sc7180
>>>>>>> devices. Fix the WB atomic check, which caused the issue.
>>>>>>> ---
>>>>>>> Changes in v4:
>>>>>>> - Epanded commit message (Johan)
>>>>>>> - Link to v3: https://lore.kernel.org/r/20241208-dpu-fix-wb-v3-1-a1de69ce4a1b@linaro.org
>>>>>>>
>>>>>>> Changes in v3:
>>>>>>> - Rebased on top of msm-fixes
>>>>>>> - Link to v2: https://lore.kernel.org/r/20240802-dpu-fix-wb-v2-0-7eac9eb8e895@linaro.org
>>>>>>>
>>>>>>> Changes in v2:
>>>>>>> - Reworked the writeback to just drop the connector->status check.
>>>>>>> - Expanded commit message for the debugging patch.
>>>>>>> - Link to v1: https://lore.kernel.org/r/20240709-dpu-fix-wb-v1-0-448348bfd4cb@linaro.org
>>>>>>> ---
>>>>>>>      drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c | 3 ---
>>>>>>>      1 file changed, 3 deletions(-)
>>>>>>>
>>>>>>> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c
>>>>>>> index 16f144cbc0c986ee266412223d9e605b01f9fb8c..8ff496082902b1ee713e806140f39b4730ed256a 100644
>>>>>>> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c
>>>>>>> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_writeback.c
>>>>>>> @@ -42,9 +42,6 @@ static int dpu_wb_conn_atomic_check(struct drm_connector *connector,
>>>>>>>          if (!conn_state || !conn_state->connector) {
>>>>>>>                  DPU_ERROR("invalid connector state\n");
>>>>>>>                  return -EINVAL;
>>>>>>> -     } else if (conn_state->connector->status != connector_status_connected) {
>>>>>>> -             DPU_ERROR("connector not connected %d\n", conn_state->connector->status);
>>>>>>> -             return -EINVAL;
>>>>>>>          }
>>>>>>
>>>>>> Can you please explain me about why the status was not already connected?
>>>>>>
>>>>>> In all the places I checked (unless I missed something), if the detect()
>>>>>> callback is not registered, the connector is assumed connected like below:
>>>>>>
>>>>>> 404     if (funcs->detect_ctx)
>>>>>> 405             ret = funcs->detect_ctx(connector, ctx, force);
>>>>>> 406     else if (connector->funcs->detect)
>>>>>> 407             ret = connector->funcs->detect(connector, force);
>>>>>> 408     else
>>>>>> 409             ret = connector_status_connected;
>>>>>>
>>>>>> We do not register .detect for WB as WB connector should be always
>>>>>> connected.
>>>>>>
>>>>>> What scenario or use-case is making the connector status to something
>>>>>> other than connected?
>>>>>>
>>>>>> The places which mark the connector as unknown seem to be covered by
>>>>>> warnings in drm_probe_helper.c but the bug report doesnt seem to be
>>>>>> hitting those.
>>>>>
>>>>> Because nobody asks for the getconnector on that connector. For
>>>>> example,drm_client_for_each_connector_iter() explicitly skips
>>>>> WRITEBACK connectors. So, drm_client_modeset_probe() also doesn't
>>>>> request ->fill_modes() on that connector.
>>>>>
>>>>> I'm almost sure that if somebody runs a `modetest -ac` on the
>>>>> unpatched kernel after boot, there will be no suspend-related issues.
>>>>> In fact, I've just checked on RB5.
>>>>> /sys/class/drm/card0-Writeback-1/status reports 'unknown' before and
>>>>> 'connected' afterwards. You can easily replicate that on your
>>>>> hardware.
>>>>>
>>>>
>>>> Yes this is correct, I just checked on sc7180.
>>>>
>>>> It stays at unknown till we run IGT. This matches your explanation
>>>> perfectly.
>>>>
>>>>>>
>>>>>> I am wondering if there is some case in fwk resetting this to unknown
>>>>>> silently (which is incorrect) and perhaps other drivers dont hit this as
>>>>>> there is a .detect registered which always returns connected and MSM
>>>>>> should follow that.
>>>>>>
>>>>>> 111 static enum drm_connector_status
>>>>>> 112 komeda_wb_connector_detect(struct drm_connector *connector, bool force)
>>>>>> 113 {
>>>>>> 114     return connector_status_connected;
>>>>>> 115 }
>>>>>> 116
>>>>>
>>>>> No, that won't help. You can add a detect() callback and verify that
>>>>> simply isn't getting called. It's not resetting the connector->status,
>>>>> it's nobody setting it for the first time.
>>>>>
>>>>
>>>> What we found is that drm_atomic_helper_suspend() which calls
>>>> drm_atomic_helper_duplicate_state(), uses drm_for_each_connector_iter()
>>>> which does not rely on the last atomic state but actually uses the
>>>> config->connector_list which in-turn disables all connectors including WB.
>>>>
>>>> Is this expected that even though WB was not really there in the last
>>>> atomic_state before the suspend, still ended up suspending / resuming and
>>>> thus exposing this bug?
>>>
>>> Note, atomic_state is a "patch", not a full state. Thus the described
>>> behaviour is expected: it walks over all connectors and checks their
>>> drm_connector_state information. Some of the connectors (if they were
>>> not touched by the commit) might have been skipped from the last
>>> drm_atomic_state structure.
>>>
>>
>> Yes, I understand the patched part. So what i meant was, we observed
>> that CLIENT_CAP_WRITEBACK was never set which means WB was never exposed
>> as a connector so it could not have been part of any commits. IOW, it
>> would have never been enabled. In that case, is it still right to
>> disable WB connector and enable it again considering that it could not
>> have been enabled at any point before this.
> 
> ... to disable WB connector on suspend and restore its state on resume ...
> 
> Yes, it's correct behaviour. It requires less clumsy code and less
> special cases compared to using some other euristics in order to
> select, which connector was ever enabled.
> 

Acked..it will be harder to track this for sure, I am fine with this now

Reviewed-by: Abhinav Kumar <quic_abhinavk@...cinc.com>

FWIW, we retested kms_writeback with this and it still works as well.

>>
>>>>
>>>> I am  now more convinced of this change as I understand the flow better. But
>>>> wanted to highlight above observation.
>>>>
>>>>>>>
>>>>>>>          crtc = conn_state->crtc;
>>>>>>>
>>>>>>> ---
>>>>>>> base-commit: 86313a9cd152330c634b25d826a281c6a002eb77
>>>>>>> change-id: 20240709-dpu-fix-wb-6cd57e3eb182
>>>>>>>
>>>>>>> Best regards,
>>>>>
>>>>>
>>>>>
>>>
> 
> 
> 

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ