Message-ID: <d32b00cf-afe4-47ac-95e1-1c4321aeb7f2@kernel.org>
Date: Tue, 3 Feb 2026 15:14:46 -0500
From: Chuck Lever <cel@...nel.org>
To: Tejun Heo <tj@...nel.org>
Cc: jiangshanlai@...il.com, linux-kernel@...r.kernel.org,
 Chuck Lever <chuck.lever@...cle.com>
Subject: Re: [RFC PATCH] workqueue: Automatic affinity scope fallback for
 single-pod topologies

On 2/3/26 2:10 PM, Tejun Heo wrote:
> Hello,
> 
> On Tue, Feb 03, 2026 at 09:37:44AM -0500, Chuck Lever wrote:
>> On such systems WQ_AFFN_CACHE, WQ_AFFN_SMT, and WQ_AFFN_NUMA scopes all
>> collapse to a single pod.
> 
> WQ_AFFN_SMT should be on CPU core boundaries, right?
> 
>> Add wq_effective_affn_scope() to detect when a selected affinity scope
>> provides only one pod despite having multiple CPUs, and automatically
>> fall back to a finer-grained scope. This ensures reasonable lock
>> distribution without requiring manual configuration via the
>> workqueue.default_affinity_scope parameter or per-workqueue sysfs
>> tuning.
>>
>> The fallback is conservative: it triggers only when a scope degenerates
>> to exactly one pod, and respects explicitly configured (non-default)
>> scopes.
> 
> While I understand the problem, I don't think dropping down to the core
> boundary for unbound workqueues by default makes sense. That may help with
> some use cases but cause problems with others.

I've never seen a case where it doesn't help. In order to craft an
alternative, I'll need some examples of cases to avoid. Is it only the
SMT case that is concerning?
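
To make this concrete, the fallback in the RFC amounts to something
along these lines (a simplified sketch, not the code from the patch
itself; it assumes the in-file wq_pod_types[] table and that lower
enum wq_affn_scope values are finer-grained):

static enum wq_affn_scope wq_effective_affn_scope(enum wq_affn_scope scope)
{
        /*
         * Walk toward finer-grained scopes while the selected scope
         * degenerates to a single pod on a multi-CPU system.
         */
        while (scope > WQ_AFFN_CPU &&
               wq_pod_types[scope].nr_pods <= 1 &&
               num_possible_cpus() > 1)
                scope--;

        return scope;
}

Explicitly configured (non-default) scopes bypass the fallback and are
used as-is.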


> Given that WQ_AFFN_CACHE is the same as
> WQ_AFFN_NUMA on these machines, maybe we can shard it automatically
> according to some heuristics, or maybe we can introduce another affinity
> level between CACHE and SMT that is sharded on machines with too many CPUs
> in a single cache domain.
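
A per-LLC sharding heuristic along those lines could work. A rough,
untested sketch of the idea (hypothetical names and threshold, purely
to illustrate; shards are balanced by interleaving CPUs):

#define WQ_MAX_CPUS_PER_POD     16      /* assumed threshold, needs tuning */

/* Map @cpu to a shard index within its LLC domain @llc_mask. */
static int wq_cache_shard(int cpu, const struct cpumask *llc_mask)
{
        int weight = cpumask_weight(llc_mask);
        int nr_shards = DIV_ROUND_UP(weight, WQ_MAX_CPUS_PER_POD);
        int idx = 0, c;

        /* Position of @cpu within the LLC mask. */
        for_each_cpu(c, llc_mask) {
                if (c == cpu)
                        break;
                idx++;
        }

        /* Interleave so shard sizes stay roughly equal. */
        return idx % nr_shards;
}

That would keep the existing CACHE semantics on machines with
reasonably sized cache domains and only split when a single domain
covers too many CPUs.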


-- 
Chuck Lever
