[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <813d07f7-b430-4c95-bac3-931188415593@amd.com>
Date: Thu, 30 Oct 2025 18:14:17 +0100
From: Christian König <christian.koenig@....com>
To: Marco Crivellari <marco.crivellari@...e.com>,
linux-kernel@...r.kernel.org, amd-gfx@...ts.freedesktop.org,
dri-devel@...ts.freedesktop.org
Cc: Tejun Heo <tj@...nel.org>, Lai Jiangshan <jiangshanlai@...il.com>,
Frederic Weisbecker <frederic@...nel.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Michal Hocko <mhocko@...e.com>, Alex Deucher <alexander.deucher@....com>,
David Airlie <airlied@...il.com>, Simona Vetter <simona@...ll.ch>
Subject: Re: [PATCH 1/4] drm/amdgpu: replace use of system_unbound_wq with
system_dfl_wq
On 10/30/25 17:10, Marco Crivellari wrote:
> Currently if a user enqueue a work item using schedule_delayed_work() the
> used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
> WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
> schedule_work() that is using system_wq and queue_work(), that makes use
> again of WORK_CPU_UNBOUND.
>
> This lack of consistency cannot be addressed without refactoring the API.
>
> system_unbound_wq should be the default workqueue so as not to enforce
> locality constraints for random work whenever it's not required.
>
> Adding system_dfl_wq to encourage its use when unbound work should be used.
>
> The old system_unbound_wq will be kept for a few release cycles.
In all the cases below we actually want the work to run on a different CPU than the current one.
So using system_unbound_wq seems to be more appropriate.
Regards,
Christian.
>
> Suggested-by: Tejun Heo <tj@...nel.org>
> Signed-off-by: Marco Crivellari <marco.crivellari@...e.com>
> ---
> drivers/gpu/drm/amd/amdgpu/aldebaran.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 2 +-
> drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c | 2 +-
> 3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/amd/amdgpu/aldebaran.c b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
> index 9569dc16dd3d..7957e6c4c416 100644
> --- a/drivers/gpu/drm/amd/amdgpu/aldebaran.c
> +++ b/drivers/gpu/drm/amd/amdgpu/aldebaran.c
> @@ -175,7 +175,7 @@ aldebaran_mode2_perform_reset(struct amdgpu_reset_control *reset_ctl,
> list_for_each_entry(tmp_adev, reset_device_list, reset_list) {
> /* For XGMI run all resets in parallel to speed up the process */
> if (tmp_adev->gmc.xgmi.num_physical_nodes > 1) {
> - if (!queue_work(system_unbound_wq,
> + if (!queue_work(system_dfl_wq,
> &tmp_adev->reset_cntl->reset_work))
> r = -EALREADY;
> } else
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> index 7a899fb4de29..8c4d79f6c14f 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
> @@ -6033,7 +6033,7 @@ int amdgpu_do_asic_reset(struct list_head *device_list_handle,
> list_for_each_entry(tmp_adev, device_list_handle, reset_list) {
> /* For XGMI run all resets in parallel to speed up the process */
> if (tmp_adev->gmc.xgmi.num_physical_nodes > 1) {
> - if (!queue_work(system_unbound_wq,
> + if (!queue_work(system_dfl_wq,
> &tmp_adev->xgmi_reset_work))
> r = -EALREADY;
> } else
> diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> index 28c4ad62f50e..9c4631608526 100644
> --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_reset.c
> @@ -116,7 +116,7 @@ static int amdgpu_reset_xgmi_reset_on_init_perform_reset(
> /* Mode1 reset needs to be triggered on all devices together */
> list_for_each_entry(tmp_adev, reset_device_list, reset_list) {
> /* For XGMI run all resets in parallel to speed up the process */
> - if (!queue_work(system_unbound_wq, &tmp_adev->xgmi_reset_work))
> + if (!queue_work(system_dfl_wq, &tmp_adev->xgmi_reset_work))
> r = -EALREADY;
> if (r) {
> dev_err(tmp_adev->dev,
Powered by blists - more mailing lists