linux-kernel - Re: [PATCH] drm/msm/dpu: Force disabling commits to take non-async path

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <f9863c54-ca35-476b-b6aa-0eb1c18ec29e@quicinc.com>
Date: Wed, 8 Jan 2025 19:14:51 -0800
From: Abhinav Kumar <quic_abhinavk@...cinc.com>
To: Rob Clark <robdclark@...il.com>
CC: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>,
        Jessica Zhang
	<quic_jesszhan@...cinc.com>,
        Sean Paul <sean@...rly.run>,
        Marijn Suijten
	<marijn.suijten@...ainline.org>,
        David Airlie <airlied@...il.com>, "Simona
 Vetter" <simona@...ll.ch>,
        Rob Clark <robdclark@...omium.org>, <linux-arm-msm@...r.kernel.org>,
        <dri-devel@...ts.freedesktop.org>, <freedreno@...ts.freedesktop.org>,
        <linux-kernel@...r.kernel.org>
Subject: Re: [PATCH] drm/msm/dpu: Force disabling commits to take non-async
 path



On 1/8/2025 7:04 PM, Rob Clark wrote:
> On Wed, Jan 8, 2025 at 6:22 PM Abhinav Kumar <quic_abhinavk@...cinc.com> wrote:
>>
>>
>>
>> On 1/8/2025 6:14 PM, Dmitry Baryshkov wrote:
>>> On Thu, 9 Jan 2025 at 03:45, Rob Clark <robdclark@...il.com> wrote:
>>>>
>>>> On Wed, Jan 8, 2025 at 2:58 PM Jessica Zhang <quic_jesszhan@...cinc.com> wrote:
>>>>>
>>>>> Force commit that are disabling a plane in the async_crtc to take the
>>>>> non-async commit tail path.
>>>>>
>>>>> In cases where there are two consecutive async cursor updates (one
>>>>> regular non-NULL update followed by a disabling NULL FB update), it is
>>>>> possible for the second NULL update to not be queued (due to the
>>>>> pending_crtc_mask check) or otherwise not be run before the cursor FB is
>>>>> deallocated by drm_atomic_helper_cleanup_planes(). This would cause a
>>>>> context fault since the hardware would try to fetch the old plane state
>>>>> with the stale FB address.
>>>>>
>>>>> Avoid this issue by forcing cursor updates that will disable the cursor
>>>>> plane to be blocking commits. This will ensure that hardware clears and
>>>>> stops fetching the FB source address before the driver deallocates the FB
>>>>>
>>>>> Fixes: 2d99ced787e3 ("drm/msm: async commit support")
>>>>> Signed-off-by: Jessica Zhang <quic_jesszhan@...cinc.com>
>>>>> ---
>>>>>    drivers/gpu/drm/msm/msm_atomic.c | 13 +++++++++++++
>>>>>    1 file changed, 13 insertions(+)
>>>>>
>>>>> diff --git a/drivers/gpu/drm/msm/msm_atomic.c b/drivers/gpu/drm/msm/msm_atomic.c
>>>>> index 9c45d641b5212c11078ab38c13a519663d85e10a..ddc74c68148c643d34ca631dd28d4cdc2b8c7dc0 100644
>>>>> --- a/drivers/gpu/drm/msm/msm_atomic.c
>>>>> +++ b/drivers/gpu/drm/msm/msm_atomic.c
>>>>> @@ -142,6 +142,7 @@ static bool can_do_async(struct drm_atomic_state *state,
>>>>>           struct drm_connector_state *connector_state;
>>>>>           struct drm_connector *connector;
>>>>>           struct drm_crtc_state *crtc_state;
>>>>> +       struct drm_plane_state *plane_state;
>>>>>           struct drm_crtc *crtc;
>>>>>           int i, num_crtcs = 0;
>>>>>
>>>>> @@ -162,6 +163,18 @@ static bool can_do_async(struct drm_atomic_state *state,
>>>>>                   *async_crtc = crtc;
>>>>>           }
>>>>>
>>>>> +       /*
>>>>> +        * Force a blocking commit if the cursor is being disabled. This is to
>>>>> +        * ensure that the registers are cleared and hardware doesn't try to
>>>>> +        * fetch from a stale address.
>>>>> +        */
>>>>> +       if (*async_crtc) {
>>>>> +               plane_state = drm_atomic_get_new_plane_state(state,
>>>>> +                                                            (*async_crtc)->cursor);
>>>>> +               if (plane_state && !plane_state->fb)
>>>>> +                       return false;
>>>>
>>>> hmm, I suppose we want the same even if the fb changes?  Or
>>>> alternatively somewhere hold an extra ref to the backing obj until hw
>>>> has finished scanout?
>>>
>>
>> Hi Rob
>>
>> Do you mean we need to also check if old_plane_state->fb !=
>> new_plane_state->fb, then use blocking commit?
> 
> yeah, basically.. if we release any outgoing fb the backing bo could
> be potentially freed+unmapped leading to the same problem.
> 

Yeah true, this case also we can hit this. Will add it and check.

> idk if this more conservative approach would cause fps issues..
> holding an extra ref would avoid potential issues, but offhand I'm not
> sure if it would be a perf problem in practice.  Maybe with animated
> cursors?
> 

hmmm.... we did not see any significant lags or drops when we tested 
this (that was also our major concern to make sure we dont)

If we do have to hold an extra ref, we will have to do something like below:

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c 
b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
index 098abc2c0003..97d9e056038c 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_plane.c
@@ -649,8 +649,10 @@ static int dpu_plane_prepare_fb(struct drm_plane 
*plane,
         struct dpu_kms *kms = _dpu_plane_get_kms(&pdpu->base);
         int ret;

-       if (!new_state->fb)
+       if (!new_state->fb) {
+               refcount_inc(&msm_fb->dirtyfb);
                 return 0;
+       }

         DPU_DEBUG_PLANE(pdpu, "FB[%u]\n", fb->base.id);

@@ -682,8 +684,10 @@ static void dpu_plane_cleanup_fb(struct drm_plane 
*plane,
         struct dpu_plane *pdpu = to_dpu_plane(plane);
         struct dpu_plane_state *old_pstate;

-       if (!old_state || !old_state->fb)
+       if (!old_state || !old_state->fb) {
+               refcount_dec(&msm_fb->dirtyfb);
                 return;
+       }

         old_pstate = to_dpu_plane_state(old_state);

I dont know if this is clean though. WDYT?


> BR,
> -R
> 
>> We can try that out.
>>
>> holding extra ref gets tricky IMO. In this way, the calls are balanced
>> in places we know.
>>
>>> I think a more correct approach would be to run a worker, waiting for
>>> the commit to happen and then freeing the FBs.
>>>
>>
>> Hi Dmitry
>>
>> This option was tried . It gets very messy to handle it this way. Then
>> we realized that, the worker is going to try to do the same thing a
>> blocking commit does which is to wait for hw to finish scanout and then
>> cleanup planes. Hence this was preferred and is better IMO.
>>
>>>>
>>>> BR,
>>>> -R
>>>>
>>>>> +       }
>>>>> +
>>>>>           return true;
>>>>>    }
>>>>>
>>>>>
>>>>> ---
>>>>> base-commit: 866e43b945bf98f8e807dfa45eca92f931f3a032
>>>>> change-id: 20250108-async-disable-fix-cc1b9a1d5b19
>>>>>
>>>>> Best regards,
>>>>> --
>>>>> Jessica Zhang <quic_jesszhan@...cinc.com>
>>>>>
>>>
>>>
>>>