linux-kernel - Re: [PATCH RESEND] lib/group_cpus: make group CPU cluster aware

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <a101fe80-ca0b-4b4b-94b1-f08db1b164fc@intel.com>
Date: Wed, 12 Nov 2025 11:02:47 +0800
From: "Guo, Wangyang" <wangyang.guo@...el.com>
To: Ming Lei <ming.lei@...hat.com>
Cc: Andrew Morton <akpm@...ux-foundation.org>,
 Thomas Gleixner <tglx@...utronix.de>, Keith Busch <kbusch@...nel.org>,
 Jens Axboe <axboe@...com>, Christoph Hellwig <hch@....de>,
 Sagi Grimberg <sagi@...mberg.me>, linux-kernel@...r.kernel.org,
 linux-nvme@...ts.infradead.org, virtualization@...ts.linux-foundation.org,
 linux-block@...r.kernel.org, Tianyou Li <tianyou.li@...el.com>,
 Tim Chen <tim.c.chen@...ux.intel.com>, Dan Liang <dan.liang@...el.com>
Subject: Re: [PATCH RESEND] lib/group_cpus: make group CPU cluster aware

On 11/11/2025 8:08 PM, Ming Lei wrote:
> On Tue, Nov 11, 2025 at 01:31:04PM +0800, Guo, Wangyang wrote:
>> On 11/11/2025 11:25 AM, Ming Lei wrote:
>>> On Tue, Nov 11, 2025 at 10:06:08AM +0800, Wangyang Guo wrote:
>>>> As CPU core counts increase, the number of NVMe IRQs may be smaller than
>>>> the total number of CPUs. This forces multiple CPUs to share the same
>>>> IRQ. If the IRQ affinity and the CPU’s cluster do not align, a
>>>> performance penalty can be observed on some platforms.
>>>
>>> Can you add details why/how CPU cluster isn't aligned with IRQ
>>> affinity? And how performance penalty is caused?
>>
>> Intel Xeon E platform packs 4 CPU cores as 1 module (cluster) and share the
>> L2 cache. Let's say, if there are 40 CPUs in 1 NUMA domain and 11 IRQs to
>> dispatch. The existing algorithm will map first 7 IRQs each with 4 CPUs and
>> remained 4 IRQs each with 3 CPUs each. The last 4 IRQs may have cross
>> cluster issue. For example, the 9th IRQ which pinned to CPU32, then for
>> CPU31, it will have cross L2 memory access.
> 
> 
> CPUs sharing L2 usually have small number, and it is common to see one queue
> mapping includes CPUs from different L2.
> 
> So how much does crossing L2 hurt IO perf?
We see 15%+ performance difference in FIO libaio/randread/bs=8k.

> They still should share same L3 cache, and cpus_share_cache() should be
> true when the IO completes on the CPU which belong to different L2 with the
> submission CPU, and remote completion via IPI won't be triggered.
Yes, remote IPI not triggered.


BR
Wangyang