Message-ID: <5isw7c5kkef4kql4qcous3gmwhvgwc53ntgjm4staymqr67ktm@iw3cr2gr2iko>
Date: Mon, 1 Jul 2024 22:43:16 +0300
From: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
To: Abhinav Kumar <quic_abhinavk@...cinc.com>
Cc: freedreno@...ts.freedesktop.org, Rob Clark <robdclark@...il.com>,
Sean Paul <sean@...rly.run>, Marijn Suijten <marijn.suijten@...ainline.org>,
David Airlie <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>, dri-devel@...ts.freedesktop.org,
quic_jesszhan@...cinc.com, swboyd@...omium.org, dianders@...omium.org,
linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/5] drm/msm/dpu: rate limit snapshot capture for mmu faults
On Fri, Jun 28, 2024 at 02:48:47PM GMT, Abhinav Kumar wrote:
> There is no recovery mechanism in place yet to recover from mmu
> faults for DPU. We can only prevent the faults by making sure there
> is no misconfiguration.
>
> Rate-limit the snapshot capture for mmu faults to once per
> msm_kms_init_aspace() as that should be sufficient to capture
> the snapshot for debugging otherwise there will be a lot of
> dpu snapshots getting captured for the same fault which is
> redundant and also might affect capturing even one snapshot
> accurately.
Please squash this into the first patch. There is no need to add code
with a known deficiency.
Also, is there a reason why you haven't used <linux/ratelimit.h>?
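For readers unfamiliar with the header mentioned above: <linux/ratelimit.h>
offers interval-and-burst rate limiting (DEFINE_RATELIMIT_STATE() and
__ratelimit()), whereas the patch below uses a one-shot counter that is only
reset in msm_kms_init_aspace(). The following is a rough userspace sketch of
the interval-and-burst idea, not the kernel implementation; struct rl_state,
rl_init() and rl_allow() are hypothetical names invented for illustration, and
time is passed in as abstract ticks rather than read from jiffies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of interval/burst rate limiting.
 * This is NOT the kernel's struct ratelimit_state.
 */
struct rl_state {
	uint64_t interval;	/* window length, in caller-defined ticks */
	unsigned int burst;	/* events allowed per window */
	unsigned int passed;	/* events already allowed in this window */
	uint64_t begin;		/* tick at which the current window started */
	bool started;		/* false until the first event is seen */
};

static void rl_init(struct rl_state *rs, uint64_t interval, unsigned int burst)
{
	assert(rs);
	rs->interval = interval;
	rs->burst = burst;
	rs->passed = 0;
	rs->begin = 0;
	rs->started = false;
}

/* Returns true if the event arriving at tick 'now' may proceed. */
static bool rl_allow(struct rl_state *rs, uint64_t now)
{
	assert(rs);

	/* Open a new window on the first event or once the old one expired. */
	if (!rs->started || now - rs->begin >= rs->interval) {
		rs->begin = now;
		rs->passed = 0;
		rs->started = true;
	}

	if (rs->passed < rs->burst) {
		rs->passed++;
		return true;
	}

	return false;
}
```

With a burst of 1, a storm of faults inside one window would trigger a single
capture, and a fault arriving after the window expires would be captured
again without any explicit reset.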
>
> Signed-off-by: Abhinav Kumar <quic_abhinavk@...cinc.com>
> ---
> drivers/gpu/drm/msm/msm_kms.c | 6 +++++-
> drivers/gpu/drm/msm/msm_kms.h | 3 +++
> 2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/msm/msm_kms.c b/drivers/gpu/drm/msm/msm_kms.c
> index d5d3117259cf..90a333920c01 100644
> --- a/drivers/gpu/drm/msm/msm_kms.c
> +++ b/drivers/gpu/drm/msm/msm_kms.c
> @@ -168,7 +168,10 @@ static int msm_kms_fault_handler(void *arg, unsigned long iova, int flags, void
> {
> struct msm_kms *kms = arg;
>
> - msm_disp_snapshot_state(kms->dev);
> + if (!kms->fault_snapshot_capture) {
> + msm_disp_snapshot_state(kms->dev);
> + kms->fault_snapshot_capture++;
When is it decremented?
> + }
>
> return -ENOSYS;
> }
> @@ -208,6 +211,7 @@ struct msm_gem_address_space *msm_kms_init_aspace(struct drm_device *dev)
> mmu->funcs->destroy(mmu);
> }
>
> + kms->fault_snapshot_capture = 0;
> msm_mmu_set_fault_handler(aspace->mmu, kms, msm_kms_fault_handler);
>
> return aspace;
> diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
> index 1e0c54de3716..240b39e60828 100644
> --- a/drivers/gpu/drm/msm/msm_kms.h
> +++ b/drivers/gpu/drm/msm/msm_kms.h
> @@ -134,6 +134,9 @@ struct msm_kms {
> int irq;
> bool irq_requested;
>
> + /* rate limit the snapshot capture to once per attach */
> + int fault_snapshot_capture;
> +
> /* mapper-id used to request GEM buffer mapped for scanout: */
> struct msm_gem_address_space *aspace;
>
> --
> 2.44.0
>
--
With best wishes
Dmitry