linux-kernel - Re: [PATCH 5/5] drm/msm/dpu: rate limit snapshot capture for mmu faults


Message-ID: <5isw7c5kkef4kql4qcous3gmwhvgwc53ntgjm4staymqr67ktm@iw3cr2gr2iko>
Date: Mon, 1 Jul 2024 22:43:16 +0300
From: Dmitry Baryshkov <dmitry.baryshkov@...aro.org>
To: Abhinav Kumar <quic_abhinavk@...cinc.com>
Cc: freedreno@...ts.freedesktop.org, Rob Clark <robdclark@...il.com>, 
	Sean Paul <sean@...rly.run>, Marijn Suijten <marijn.suijten@...ainline.org>, 
	David Airlie <airlied@...il.com>, Daniel Vetter <daniel@...ll.ch>, dri-devel@...ts.freedesktop.org, 
	quic_jesszhan@...cinc.com, swboyd@...omium.org, dianders@...omium.org, 
	linux-arm-msm@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH 5/5] drm/msm/dpu: rate limit snapshot capture for mmu
 faults

On Fri, Jun 28, 2024 at 02:48:47PM GMT, Abhinav Kumar wrote:
> There is no recovery mechanism in place yet to recover from mmu
> faults for DPU. We can only prevent the faults by making sure there
> is no misconfiguration.
> 
> Rate-limit the snapshot capture for mmu faults to once per
> msm_kms_init_aspace() as that should be sufficient to capture
> the snapshot for debugging otherwise there will be a lot of
> dpu snapshots getting captured for the same fault which is
> redundant and also might affect capturing even one snapshot
> accurately.

Please squash this into the first patch. There is no need to add code
with a known deficiency.

Also, is there a reason why you haven't used <linux/ratelimit.h>?
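
For readers unfamiliar with the interval/burst idea behind <linux/ratelimit.h>
(which exposes helpers such as DEFINE_RATELIMIT_STATE() and __ratelimit()),
here is a minimal userspace sketch. It is not the kernel implementation and
not part of the patch: struct snapshot_ratelimit, snapshot_ratelimit_ok() and
the abstract "tick" time base are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace model of an interval/burst rate limiter.
 * This is NOT the kernel's struct ratelimit_state; it only illustrates
 * the idea of allowing at most 'burst' events per 'interval'.
 */
struct snapshot_ratelimit {
	unsigned long interval;	/* window length, in abstract ticks */
	unsigned int burst;	/* captures allowed per window */
	unsigned long begin;	/* tick at which the current window started */
	unsigned int used;	/* captures already taken in this window */
	bool started;		/* false until the first event is seen */
};

/* Return true if a snapshot capture is allowed at time 'now'. */
static bool snapshot_ratelimit_ok(struct snapshot_ratelimit *rl,
				  unsigned long now)
{
	/* Open a new window on the first event or once the old one expired. */
	if (!rl->started || now - rl->begin >= rl->interval) {
		rl->begin = now;
		rl->used = 0;
		rl->started = true;
	}

	/* Allow the event only while the window's budget is not exhausted. */
	if (rl->used < rl->burst) {
		rl->used++;
		return true;
	}

	return false;
}
```

With interval = 100 and burst = 1, a storm of faults inside one window yields a
single allowed capture, and the next capture becomes possible once the window
expires. Unlike a one-shot flag, this needs no explicit reset.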

> 
> Signed-off-by: Abhinav Kumar <quic_abhinavk@...cinc.com>
> ---
>  drivers/gpu/drm/msm/msm_kms.c | 6 +++++-
>  drivers/gpu/drm/msm/msm_kms.h | 3 +++
>  2 files changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/gpu/drm/msm/msm_kms.c b/drivers/gpu/drm/msm/msm_kms.c
> index d5d3117259cf..90a333920c01 100644
> --- a/drivers/gpu/drm/msm/msm_kms.c
> +++ b/drivers/gpu/drm/msm/msm_kms.c
> @@ -168,7 +168,10 @@ static int msm_kms_fault_handler(void *arg, unsigned long iova, int flags, void
>  {
>  	struct msm_kms *kms = arg;
>  
> -	msm_disp_snapshot_state(kms->dev);
> +	if (!kms->fault_snapshot_capture) {
> +		msm_disp_snapshot_state(kms->dev);
> +		kms->fault_snapshot_capture++;

When is it decremented?

> +	}
>  
>  	return -ENOSYS;
>  }
> @@ -208,6 +211,7 @@ struct msm_gem_address_space *msm_kms_init_aspace(struct drm_device *dev)
>  		mmu->funcs->destroy(mmu);
>  	}
>  
> +	kms->fault_snapshot_capture = 0;
>  	msm_mmu_set_fault_handler(aspace->mmu, kms, msm_kms_fault_handler);
>  
>  	return aspace;
> diff --git a/drivers/gpu/drm/msm/msm_kms.h b/drivers/gpu/drm/msm/msm_kms.h
> index 1e0c54de3716..240b39e60828 100644
> --- a/drivers/gpu/drm/msm/msm_kms.h
> +++ b/drivers/gpu/drm/msm/msm_kms.h
> @@ -134,6 +134,9 @@ struct msm_kms {
>  	int irq;
>  	bool irq_requested;
>  
> +	/* rate limit the snapshot capture to once per attach */
> +	int fault_snapshot_capture;
> +
>  	/* mapper-id used to request GEM buffer mapped for scanout: */
>  	struct msm_gem_address_space *aspace;
>  
> -- 
> 2.44.0
> 
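
To make the behaviour of the quoted hunks concrete, here is a small userspace
model of the flag as the diff uses it. It is a sketch only: struct kms_model,
model_init_aspace(), model_fault_handler() and the snapshots_taken counter are
hypothetical stand-ins, not driver code. It assumes nothing beyond what the
diff shows: the field is incremented on the first fault and set back to 0 only
in msm_kms_init_aspace().

```c
/*
 * Hypothetical stand-in for the single field the patch adds to
 * struct msm_kms, plus a counter used only by this sketch.
 */
struct kms_model {
	int fault_snapshot_capture;	/* mirrors the field from the patch */
	int snapshots_taken;		/* sketch-only bookkeeping */
};

/* Mirrors the reset added to msm_kms_init_aspace() in the patch. */
static void model_init_aspace(struct kms_model *kms)
{
	kms->fault_snapshot_capture = 0;
}

/* Mirrors the check added to msm_kms_fault_handler() in the patch. */
static void model_fault_handler(struct kms_model *kms)
{
	if (!kms->fault_snapshot_capture) {
		kms->snapshots_taken++;		/* stands in for the snapshot call */
		kms->fault_snapshot_capture++;	/* never decremented afterwards */
	}
}
```

In this model, any number of faults after one model_init_aspace() call
produces exactly one snapshot, and another snapshot is possible only after
model_init_aspace() runs again.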

-- 
With best wishes
Dmitry