Message-ID: <20250612133335.788593-1-marco.crivellari@suse.com>
Date: Thu, 12 Jun 2025 15:33:32 +0200
From: Marco Crivellari <marco.crivellari@...e.com>
To: linux-kernel@...r.kernel.org
Cc: Tejun Heo <tj@...nel.org>,
Lai Jiangshan <jiangshanlai@...il.com>,
Thomas Gleixner <tglx@...utronix.de>,
Frederic Weisbecker <frederic@...nel.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Marco Crivellari <marco.crivellari@...e.com>,
Michal Hocko <mhocko@...e.com>
Subject: [PATCH v4 0/3] Workqueue: add WQ_PERCPU, system_dfl_wq and system_percpu_wq
Hi!
Below is a summary of a discussion about the Workqueue API and CPU isolation
considerations. Details and more information are available here:
"workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
=== Current situation: problems ===
Consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
set to the housekeeping CPUs, while for !WQ_UNBOUND the local CPU is selected.
This leads to different behavior when a work item is scheduled from an
isolated CPU, depending on whether the "delay" value is 0 or greater than 0:
schedule_delayed_work(, 0);
This will be handled by __queue_work(), which queues the work item on the
current local (isolated) CPU, while:
schedule_delayed_work(, 1);
will move the timer to a housekeeping CPU and schedule the work there.
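The two cases above can be sketched as a small userspace model. This is not kernel code: the CPU layout, the housekeeping[] table and the helper names are assumptions made for illustration, and the choice of "first housekeeping CPU" only stands in for wherever the timer really ends up.

```c
#include <stdbool.h>

#define NR_CPUS 4

/* Assumed layout for this sketch: CPUs 0-1 housekeeping, CPUs 2-3 isolated. */
static const bool housekeeping[NR_CPUS] = { true, true, false, false };

/* Hypothetical helper: stands in for the timer being moved off an isolated CPU. */
static int first_housekeeping_cpu(void)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		if (housekeeping[cpu])
			return cpu;
	return -1;
}

/*
 * Models where a per-CPU work item runs when queued from local_cpu:
 *  - delay == 0: queued directly on the local CPU, isolated or not;
 *  - delay  > 0: on an isolated CPU the timer moves to a housekeeping CPU
 *    and the work is queued there. On a housekeeping CPU the model keeps
 *    the work local, which is a simplification.
 */
static int model_target_cpu(int local_cpu, unsigned long delay)
{
	if (delay == 0)
		return local_cpu;
	if (housekeeping[local_cpu])
		return local_cpu;
	return first_housekeeping_cpu();
}
```

In this model, model_target_cpu(2, 0) stays on the isolated CPU 2, while model_target_cpu(2, 1) lands on housekeeping CPU 0, which is the inconsistency described above.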
Currently, if a user enqueues a work item using schedule_delayed_work(), the
wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
schedule_work(), which uses system_wq, and queue_work(), which again makes
use of WORK_CPU_UNBOUND.
This lack of consistency cannot be addressed without refactoring the API.
=== Plan and future plans ===
This patchset is the first step of the refactoring needed to address the
points above; in the long term it will also have a positive impact on
CPU isolation, by moving away from per-cpu workqueues in favor of an
unbound model.
These are the main steps:
1) API refactoring (introduced by this patchset)
- Make the system wq names clearer and more uniform, both per-cpu and
unbound, to avoid any confusion about which one should be used.
- Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND,
introduced in this patchset and to be used by all the callers that are
not currently using WQ_UNBOUND.
WQ_UNBOUND will be removed in a future release cycle.
Most users don't need to be per-cpu because they have no locality
requirements; for this reason, a future step will make "unbound" the
default behavior.
2) Check who really needs to be per-cpu
- Remove the WQ_PERCPU flag when it is not strictly required.
3) Add a new API (prefer local cpu)
- Some users don't require local execution, as mentioned above; despite
that, local execution yields a performance gain.
This new API will prefer local execution without requiring it.
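The "complement" relationship between the two flags in step 1 can be sketched in plain userspace C. The MODEL_* bit values and both helpers are placeholders invented for this sketch; they are not the kernel's definitions and do not reflect how the conversion is actually carried out.

```c
#include <stdbool.h>

/* Placeholder bit values for this sketch only; not taken from kernel headers. */
#define MODEL_WQ_UNBOUND (1u << 0)
#define MODEL_WQ_PERCPU  (1u << 1)

/* True when exactly one of the two flags is set, i.e. the choice is explicit. */
static bool model_flags_explicit(unsigned int flags)
{
	bool unbound = (flags & MODEL_WQ_UNBOUND) != 0;
	bool percpu = (flags & MODEL_WQ_PERCPU) != 0;

	return unbound != percpu;
}

/*
 * Hypothetical conversion mirroring step 1: a caller that does not pass
 * the unbound flag gets the per-cpu flag added, so its current behavior
 * is preserved but is now stated explicitly.
 */
static unsigned int model_make_explicit(unsigned int flags)
{
	if (!(flags & MODEL_WQ_UNBOUND))
		flags |= MODEL_WQ_PERCPU;
	return flags;
}
```

Once every caller states its choice explicitly, step 2 amounts to dropping the per-cpu flag from callers that have no locality requirement, and the default can later change without silently altering anyone's behavior.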
=== Changes introduced by this patchset ===
1) [P1] add system_percpu_wq and system_dfl_wq
system_wq is a per-CPU workqueue, but its name does not make that clear.
system_unbound_wq is to be used when locality is not required.
Because of that, system_percpu_wq and system_dfl_wq have been
introduced in order to replace, in the future, system_wq and
system_unbound_wq.
2) [P2] add new WQ_PERCPU flag
This patch adds the new WQ_PERCPU flag to explicitly request per-cpu behavior.
WQ_UNBOUND will be removed in a future release cycle.
3) [P3] Doc change about WQ_PERCPU
Added a short section about WQ_PERCPU and a Note under WQ_UNBOUND
mentioning that it will be removed in the future.
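The intended renaming from [P1] can be summarized as a lookup table. This is a userspace illustration only: the struct and the helper are hypothetical, and only the four workqueue names come from the text above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical table type for this sketch. */
struct wq_rename {
	const char *old_name;
	const char *new_name;
};

/* Old system workqueue names and the names meant to replace them. */
static const struct wq_rename wq_renames[] = {
	{ "system_wq",         "system_percpu_wq" },
	{ "system_unbound_wq", "system_dfl_wq"    },
};

/* Returns the replacement name, or NULL if the name is not in the table. */
static const char *model_new_wq_name(const char *old_name)
{
	size_t n = sizeof(wq_renames) / sizeof(wq_renames[0]);

	for (size_t i = 0; i < n; i++)
		if (strcmp(wq_renames[i].old_name, old_name) == 0)
			return wq_renames[i].new_name;
	return NULL;
}
```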
---
Changes in v4:
- Take a step back from the previous version, in order to first add the new
wq(s) and the new flag (WQ_PERCPU), addressing all the other changes later.
Changes in v3:
- The introduction of the new wq(s) and of the WQ_PERCPU flag has been moved
into separate patches (1 for wq(s) and 1 for WQ_PERCPU).
- WQ_PERCPU is now added to all the alloc_workqueue callers in separate patches,
addressing a few subsystems first (fs, mm, net).
Changes in v2:
- Introduction of WQ_PERCPU change has been merged with the alloc_workqueue()
patch that passes the WQ_PERCPU flag explicitly to every caller.
- 2 drivers in the code were not matched by Coccinelle; WQ_PERCPU added there too.
- WQ_PERCPU added to __WQ_BH_ALLOWED.
- queue_work() now prints a warning (pr_warn_once()) if a user is using the
old wq and redirects the wrong / old wq to the new one.
- Changes to workqueue.rst about the WQ_PERCPU flag and a Note about the
future of WQ_UNBOUND.
Marco Crivellari (3):
Workqueue: add system_percpu_wq and system_dfl_wq
Workqueue: add new WQ_PERCPU flag
[Doc] Workqueue: add WQ_PERCPU
Documentation/core-api/workqueue.rst | 10 ++++++++++
include/linux/workqueue.h | 9 ++++++---
kernel/workqueue.c | 4 ++++
3 files changed, 20 insertions(+), 3 deletions(-)
--
2.49.0