linux-kernel - Re: [PATCH v3 06/10] drm/xe/xe_late_bind_fw: Reload late binding fw in rpm resume


Message-ID: <2c4f410a-3abd-4abc-84c8-13e7e3b40a73@intel.com>
Date: Wed, 18 Jun 2025 14:05:05 -0700
From: Daniele Ceraolo Spurio <daniele.ceraolospurio@...el.com>
To: Badal Nilawar <badal.nilawar@...el.com>, <intel-xe@...ts.freedesktop.org>,
	<dri-devel@...ts.freedesktop.org>, <linux-kernel@...r.kernel.org>
CC: <anshuman.gupta@...el.com>, <rodrigo.vivi@...el.com>,
	<alexander.usyskin@...el.com>, <gregkh@...uxfoundation.org>, <jgg@...dia.com>
Subject: Re: [PATCH v3 06/10] drm/xe/xe_late_bind_fw: Reload late binding fw
 in rpm resume



On 6/18/2025 12:00 PM, Badal Nilawar wrote:
> Reload late binding fw during runtime resume.
>
> v2: Flush worker during runtime suspend
>
> Signed-off-by: Badal Nilawar <badal.nilawar@...el.com>
> ---
>   drivers/gpu/drm/xe/xe_late_bind_fw.c | 2 +-
>   drivers/gpu/drm/xe/xe_late_bind_fw.h | 1 +
>   drivers/gpu/drm/xe/xe_pm.c           | 6 ++++++
>   3 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.c b/drivers/gpu/drm/xe/xe_late_bind_fw.c
> index 54aa08c6bdfd..c0be9611c73b 100644
> --- a/drivers/gpu/drm/xe/xe_late_bind_fw.c
> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.c
> @@ -58,7 +58,7 @@ static int xe_late_bind_fw_num_fans(struct xe_late_bind *late_bind)
>   		return 0;
>   }
>   
> -static void xe_late_bind_wait_for_worker_completion(struct xe_late_bind *late_bind)
> +void xe_late_bind_wait_for_worker_completion(struct xe_late_bind *late_bind)
>   {
>   	struct xe_device *xe = late_bind_to_xe(late_bind);
>   	struct xe_late_bind_fw *lbfw;
> diff --git a/drivers/gpu/drm/xe/xe_late_bind_fw.h b/drivers/gpu/drm/xe/xe_late_bind_fw.h
> index 28d56ed2bfdc..07e437390539 100644
> --- a/drivers/gpu/drm/xe/xe_late_bind_fw.h
> +++ b/drivers/gpu/drm/xe/xe_late_bind_fw.h
> @@ -12,5 +12,6 @@ struct xe_late_bind;
>   
>   int xe_late_bind_init(struct xe_late_bind *late_bind);
>   int xe_late_bind_fw_load(struct xe_late_bind *late_bind);
> +void xe_late_bind_wait_for_worker_completion(struct xe_late_bind *late_bind);
>   
>   #endif
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index ff749edc005b..91923fd4af80 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -20,6 +20,7 @@
>   #include "xe_gt.h"
>   #include "xe_guc.h"
>   #include "xe_irq.h"
> +#include "xe_late_bind_fw.h"
>   #include "xe_pcode.h"
>   #include "xe_pxp.h"
>   #include "xe_trace.h"
> @@ -460,6 +461,8 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
>   	if (err)
>   		goto out;
>   
> +	xe_late_bind_wait_for_worker_completion(&xe->late_bind);

I think this can deadlock: the worker does an rpm_put, and if that is the 
last put, we end up here waiting for that same worker to complete.
We could probably just skip this wait, because the worker can handle rpm 
itself. What we might want to be careful about is to not re-queue it 
(from xe_late_bind_fw_load below) if it's currently being executed; we 
could also just let the fw be loaded twice if we hit that race 
condition, which shouldn't be an issue apart from doing unnecessary work.
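
To make the cycle concrete, here is a rough userspace model of it. This is
only a sketch: every model_* name is hypothetical and none of this is xe
driver code. It assumes the last put invokes the suspend callback
synchronously from the caller's context, and it records the self-wait in a
flag instead of actually blocking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; these fields do not exist in the xe driver. */
struct model_dev {
	int rpm_usage;         /* runtime-PM reference count */
	bool worker_running;   /* true while the late-bind worker executes */
	bool wait_in_suspend;  /* true: suspend waits for the worker, as in this patch */
	bool self_wait;        /* set when suspend would wait on its own caller */
	bool suspended;
};

static void model_runtime_suspend(struct model_dev *d)
{
	if (d->wait_in_suspend && d->worker_running) {
		/*
		 * A real wait would never return here: the worker being
		 * waited for is the one that triggered this suspend.
		 */
		d->self_wait = true;
		return;
	}
	d->suspended = true;
}

static void model_rpm_get(struct model_dev *d)
{
	d->rpm_usage++;
	d->suspended = false;
}

static void model_rpm_put(struct model_dev *d)
{
	/* Assumption: dropping the last reference suspends synchronously. */
	if (--d->rpm_usage == 0)
		model_runtime_suspend(d);
}

static void model_worker(struct model_dev *d)
{
	d->worker_running = true;
	model_rpm_get(d);
	/* ... the firmware push would happen here ... */
	model_rpm_put(d);	/* may be the last reference */
	d->worker_running = false;
}

/* Returns true if running the worker once hits the self-wait. */
static bool model_would_deadlock(bool wait_in_suspend)
{
	struct model_dev d = { .wait_in_suspend = wait_in_suspend };

	model_worker(&d);
	return d.self_wait;
}

/* Usage: the wait in suspend closes the cycle, skipping it does not. */
static void model_selfcheck(void)
{
	assert(model_would_deadlock(true));
	assert(!model_would_deadlock(false));
}
```

With the wait in the suspend path the model reports the self-wait; with the
wait skipped, the worker's put simply suspends the device.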

Daniele

> +
>   	/*
>   	 * Applying lock for entire list op as xe_ttm_bo_destroy and xe_bo_move_notify
>   	 * also checks and deletes bo entry from user fault list.
> @@ -550,6 +553,9 @@ int xe_pm_runtime_resume(struct xe_device *xe)
>   
>   	xe_pxp_pm_resume(xe->pxp);
>   
> +	if (xe->d3cold.allowed)
> +		xe_late_bind_fw_load(&xe->late_bind);
> +
>   out:
>   	xe_rpm_lockmap_release(xe);
>   	xe_pm_write_callback_task(xe, NULL);