[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <20250819-maden-beteuern-82c84504d0b3@brauner>
Date: Tue, 19 Aug 2025 13:23:26 +0200
From: Christian Brauner <brauner@...nel.org>
To: Marco Crivellari <marco.crivellari@...e.com>,
Tejun Heo <tj@...nel.org>
Cc: linux-kernel@...r.kernel.org, Lai Jiangshan <jiangshanlai@...il.com>,
Thomas Gleixner <tglx@...utronix.de>, Frederic Weisbecker <frederic@...nel.org>,
Sebastian Andrzej Siewior <bigeasy@...utronix.de>, Michal Hocko <mhocko@...e.com>,
Alexander Viro <viro@...iv.linux.org.uk>
Subject: Re: [PATCH 0/2] Workqueue: fs: replace use of system_wq and add
WQ_PERCPU to alloc_workqueue users
On Fri, Aug 15, 2025 at 11:47:13AM +0200, Marco Crivellari wrote:
> Hello!
>
> Below is a summary of a discussion about the Workqueue API and cpu isolation
> considerations. Details and more information are available here:
>
> "workqueue: Always use wq_select_unbound_cpu() for WORK_CPU_UNBOUND."
> https://lore.kernel.org/all/20250221112003.1dSuoGyc@linutronix.de/
>
> === Current situation: problems ===
>
> Let's consider a nohz_full system with isolated CPUs: wq_unbound_cpumask is
> set to the housekeeping CPUs, for !WQ_UNBOUND the local CPU is selected.
>
> This leads to different scenarios if a work item is scheduled on an isolated
> CPU where "delay" value is 0 or greater then 0:
> schedule_delayed_work(, 0);
>
> This will be handled by __queue_work() that will queue the work item on the
> current local (isolated) CPU, while:
>
> schedule_delayed_work(, 1);
>
> Will move the timer on an housekeeping CPU, and schedule the work there.
>
> Currently if a user enqueue a work item using schedule_delayed_work() the
> used wq is "system_wq" (per-cpu wq) while queue_delayed_work() use
> WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
> schedule_work() that is using system_wq and queue_work(), that makes use
> again of WORK_CPU_UNBOUND.
>
> This lack of consistentcy cannot be addressed without refactoring the API.
>
> === Plan and future plans ===
>
> This patchset is the first stone on a refactoring needed in order to
> address the points aforementioned; it will have a positive impact also
> on the cpu isolation, in the long term, moving away percpu workqueue in
> favor to an unbound model.
>
> These are the main steps:
> 1) API refactoring (that this patch is introducing)
> - Make more clear and uniform the system wq names, both per-cpu and
> unbound. This to avoid any possible confusion on what should be
> used.
>
> - Introduction of WQ_PERCPU: this flag is the complement of WQ_UNBOUND,
> introduced in this patchset and used on all the callers that are not
> currently using WQ_UNBOUND.
>
> WQ_UNBOUND will be removed in a future release cycle.
>
> Most users don't need to be per-cpu, because they don't have
> locality requirements, because of that, a next future step will be
> make "unbound" the default behavior.
>
> 2) Check who really needs to be per-cpu
> - Remove the WQ_PERCPU flag when is not strictly required.
>
> 3) Add a new API (prefer local cpu)
> - There are users that don't require a local execution, like mentioned
> above; despite that, local execution yeld to performance gain.
>
> This new API will prefer the local execution, without requiring it.
>
> === Introduced Changes by this patchset ===
>
> 1) [P 1] replace use of system_wq with system_percpu_wq (under fs)
>
> system_wq is a per-CPU workqueue, but his name is not clear.
> system_unbound_wq is to be used when locality is not required.
>
> Because of that, system_wq has been renamed in system_percpu_wq in the
> fs subsystm (details in the next section).
>
> 2) [P 2] add WQ_PERCPU to alloc_workqueue() users (under fs)
>
> Every alloc_workqueue() caller should use one among WQ_PERCPU or
> WQ_UNBOUND. This is actually enforced warning if both or none of them
> are present at the same time.
>
> These patches introduce WQ_PERCPU in the fs subsystem
> (details in the next section).
>
> WQ_UNBOUND will be removed in a next release cycle.
>
> === For fs Maintainers ===
>
> If you agree with these changes, one option is pull the preparation changes from
> Tejun's wq branch [1].
I'll take it through the vfs-6.18.workqueue branch.
Can I just pull the series from the list so we have all the lore links
and the cover letter?
Powered by blists - more mailing lists