Message-ID: <813d07f7-b430-4c95-bac3-931188415593@amd.com>
Date: Thu, 30 Oct 2025 18:14:17 +0100
From: Christian König <christian.koenig@....com>
To: Marco Crivellari <marco.crivellari@...e.com>,
 linux-kernel@...r.kernel.org, amd-gfx@...ts.freedesktop.org,
 dri-devel@...ts.freedesktop.org
Cc: Tejun Heo <tj@...nel.org>, Lai Jiangshan <jiangshanlai@...il.com>,
 Frederic Weisbecker <frederic@...nel.org>,
 Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
 Michal Hocko <mhocko@...e.com>, Alex Deucher <alexander.deucher@....com>,
 David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>
Subject: Re: [PATCH 1/4] drm/amdgpu: replace use of system_unbound_wq with
 system_dfl_wq

On 10/30/25 17:10, Marco Crivellari wrote:
> Currently if a user enqueues a work item using schedule_delayed_work() the
> wq used is "system_wq" (per-cpu wq) while queue_delayed_work() uses
> WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
> schedule_work(), which uses system_wq, and queue_work(), which again makes
> use of WORK_CPU_UNBOUND.
> 
> This lack of consistency cannot be addressed without refactoring the API.
> 
> system_unbound_wq should be the default workqueue so as not to enforce
> locality constraints for random work whenever it's not required.
> 
> Adding system_dfl_wq to encourage its use when unbound work should be used.
> 
> The old system_unbound_wq will be kept for a few release cycles.
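
For reference, the wrapper relationship the description above refers to can
be modelled in plain userspace C. This is only a sketch under stated
assumptions: the struct layouts, the NR_CPUS value, the *_model workqueue
objects and the "last" recording variable are stand-ins invented for
illustration and are not the kernel's definitions. Only the call chain
(schedule_work() picking system_wq, queue_work() passing WORK_CPU_UNBOUND)
mirrors what the description says.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins; none of these are the kernel's real definitions. */
#define NR_CPUS 8			/* assumed value for this model */
#define WORK_CPU_UNBOUND NR_CPUS	/* marker for "no CPU specified" */

struct workqueue_struct { const char *name; bool per_cpu; };
struct work_struct { int placeholder; };

/* Hypothetical models of the two system workqueues named in the thread. */
static struct workqueue_struct system_wq_model = { "system_wq", true };
static struct workqueue_struct system_unbound_wq_model = { "system_unbound_wq", false };

/* Records the most recent request so the wrappers can be inspected. */
static struct { int cpu; const struct workqueue_struct *wq; } last;

/* Model of the lowest-level entry point: just remember what was asked. */
static bool queue_work_on(int cpu, struct workqueue_struct *wq,
			  struct work_struct *work)
{
	(void)work;
	last.cpu = cpu;
	last.wq = wq;
	return true;
}

/* Caller chooses the workqueue; no CPU is specified. */
static bool queue_work(struct workqueue_struct *wq, struct work_struct *work)
{
	return queue_work_on(WORK_CPU_UNBOUND, wq, work);
}

/* Caller chooses nothing; the per-cpu system_wq is implied. */
static bool schedule_work(struct work_struct *work)
{
	return queue_work(&system_wq_model, work);
}

/* Small self-check exercising both paths of the model. */
static void model_self_check(void)
{
	struct work_struct w = { 0 };

	assert(schedule_work(&w));
	assert(last.wq == &system_wq_model && last.cpu == WORK_CPU_UNBOUND);

	assert(queue_work(&system_unbound_wq_model, &w));
	assert(last.wq == &system_unbound_wq_model && last.cpu == WORK_CPU_UNBOUND);
}
```

In this model both paths end up with WORK_CPU_UNBOUND as the CPU argument,
while the workqueue itself decides whether the item is per-cpu or unbound.
That is the naming mismatch the description points at.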

In all the cases below we actually want the work to run on a different CPU than the current one.

So using system_unbound_wq seems to be more appropriate.

Regards,
Christian.

> 
> Suggested-by: Tejun Heo <tj@...nel.org>
> Signed-off-by: Marco Crivellari <marco.crivellari@...e.com>
> ---
>  drivers/gpu/drm/amd/amdgpu/aldebaran.c     | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c  | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/gpu/drm/amd/amdgpu/aldebaran.c b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
> index 9569dc16dd3d..7957e6c4c416 100644
> --- a/drivers/gpu/drm/amd/amdgpu/aldebaran.c
> +++ b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
> @@ -175,7 +175,7 @@ aldebaran_mode2_perform_reset(struct amdgpu_reset_control *reset_ctl,
>  	list_for_each_entry(tmp_adev, reset_device_list, reset_list) {
>  		/* For XGMI run all resets in parallel to speed up the process */
>  		if (tmp_adev->gmc.xgmi.num_physical_nodes > 1) {
> -			if (!queue_work(system_unbound_wq,
> +			if (!queue_work(system_dfl_wq,
>  					&tmp_adev->reset_cntl->reset_work))
>  				r = -EALREADY;
>  		} else
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 7a899fb4de29..8c4d79f6c14f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -6033,7 +6033,7 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
>  		list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
>  			/* For XGMI run all resets in parallel to speed up the process */
>  			if (tmp_adev->gmc.xgmi.num_physical_nodes > 1) {
> -				if (!queue_work(system_unbound_wq,
> +				if (!queue_work(system_dfl_wq,
>  						&tmp_adev->xgmi_reset_work))
>  					r = -EALREADY;
>  			} else
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> index 28c4ad62f50e..9c4631608526 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> @@ -116,7 +116,7 @@ static int amdgpu_reset_xgmi_reset_on_init_perform_reset(
>  	/* Mode1 reset needs to be triggered on all devices together */
>  	list_for_each_entry(tmp_adev, reset_device_list, reset_list) {
>  		/* For XGMI run all resets in parallel to speed up the process */
> -		if (!queue_work(system_unbound_wq, &tmp_adev->xgmi_reset_work))
> +		if (!queue_work(system_dfl_wq, &tmp_adev->xgmi_reset_work))
>  			r = -EALREADY;
>  		if (r) {
>  			dev_err(tmp_adev->dev,