Message-ID: <ec3c6315-cbe2-44bd-a84f-f8f140c1d390@intel.com>
Date: Mon, 17 Nov 2025 12:47:24 +0200
From: Adrian Hunter <adrian.hunter@...el.com>
To: Ulf Hansson <ulf.hansson@...aro.org>
CC: Marco Crivellari <marco.crivellari@...e.com>,
<linux-kernel@...r.kernel.org>, <linux-mmc@...r.kernel.org>, Tejun Heo
<tj@...nel.org>, Lai Jiangshan <jiangshanlai@...il.com>, Frederic Weisbecker
<frederic@...nel.org>, Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
Michal Hocko <mhocko@...e.com>
Subject: Re: [PATCH] mmc: core: add WQ_PERCPU to alloc_workqueue users
On 12/11/2025 13:45, Ulf Hansson wrote:
> On Wed, 12 Nov 2025 at 07:49, Adrian Hunter <adrian.hunter@...el.com> wrote:
>>
>> On 11/11/2025 19:12, Ulf Hansson wrote:
>>> + Adrian
>>>
>>> On Fri, 7 Nov 2025 at 15:17, Marco Crivellari <marco.crivellari@...e.com> wrote:
>>>>
>>>> Currently, if a user enqueues a work item using schedule_delayed_work(), the
>>>> wq used is "system_wq" (a per-cpu wq), while queue_delayed_work() uses
>>>> WORK_CPU_UNBOUND (used when a cpu is not specified). The same applies to
>>>> schedule_work(), which uses system_wq, and queue_work(), which again makes
>>>> use of WORK_CPU_UNBOUND.
>>>> This lack of consistency cannot be addressed without refactoring the API.
>>>>
>>>> alloc_workqueue() treats all queues as per-CPU by default, while unbound
>>>> workqueues must opt-in via WQ_UNBOUND.
>>>>
>>>> This default is suboptimal: most workloads benefit from unbound queues,
>>>> allowing the scheduler to place worker threads where they’re needed and
>>>> reducing noise when CPUs are isolated.
>>>>
>>>> This continues the effort to refactor workqueue APIs, which began with
>>>> the introduction of new workqueues and a new alloc_workqueue flag in:
>>>>
>>>> commit 128ea9f6ccfb ("workqueue: Add system_percpu_wq and system_dfl_wq")
>>>> commit 930c2ea566af ("workqueue: Add new WQ_PERCPU flag")
>>>>
>>>> This change adds a new WQ_PERCPU flag to explicitly request
>>>> alloc_workqueue() to be per-cpu when WQ_UNBOUND has not been specified.
>>>>
>>>> With the introduction of the WQ_PERCPU flag (equivalent to !WQ_UNBOUND),
>>>> any alloc_workqueue() caller that doesn’t explicitly specify WQ_UNBOUND
>>>> must now use WQ_PERCPU.
>>>>
>>>> Once migration is complete, WQ_UNBOUND can be removed and unbound will
>>>> become the implicit default.
>>>>
>>>> Suggested-by: Tejun Heo <tj@...nel.org>
>>>> Signed-off-by: Marco Crivellari <marco.crivellari@...e.com>
>>>> ---
>>>> drivers/mmc/core/block.c | 3 ++-
>>>> 1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
>>>> index c0ffe0817fd4..6a651ddccf28 100644
>>>> --- a/drivers/mmc/core/block.c
>>>> +++ b/drivers/mmc/core/block.c
>>>> @@ -3275,7 +3275,8 @@ static int mmc_blk_probe(struct mmc_card *card)
>>>> mmc_fixup_device(card, mmc_blk_fixups);
>>>>
>>>> card->complete_wq = alloc_workqueue("mmc_complete",
>>>> - WQ_MEM_RECLAIM | WQ_HIGHPRI, 0);
>>>> + WQ_MEM_RECLAIM | WQ_HIGHPRI | WQ_PERCPU,
>>>> + 0);
>>>
>>> I guess we prefer to keep the existing behaviour to avoid breaking
>>> anything, before continuing with the refactoring. Although I think it
>>> should be fine to use WQ_UNBOUND here.
>>>
>>> Looping in Adrian to get his opinion around this.
>>
>> Typically the work is being queued from the CPU that received the
>> interrupt. I'd assume running the work on that CPU, as we do now,
>> has some merit.
>>
>
> Thanks, I get your point!
>
> So, to me it seems like if we want to explore other options, it would
> require us to do more analysis to avoid introducing performance
> regressions.
>
> BTW, do we know how other block device drivers are dealing with this?
AFAIK, they call blk_mq_complete_request() from the interrupt handler.
mmc_block does that in the case of CQE or HSQ.