Message-ID: <20230726131408.GA15909@lst.de>
Date:   Wed, 26 Jul 2023 15:14:08 +0200
From:   Christoph Hellwig <hch@....de>
To:     Sagi Grimberg <sagi@...mberg.me>
Cc:     Keith Busch <kbusch@...nel.org>,
        Pratyush Yadav <ptyadav@...zon.de>,
        Jens Axboe <axboe@...nel.dk>, Christoph Hellwig <hch@....de>,
        linux-nvme@...ts.infradead.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH] nvme-pci: do not set the NUMA node of device if it has none

On Wed, Jul 26, 2023 at 10:58:36AM +0300, Sagi Grimberg wrote:
>>> For example, AWS EC2's i3.16xlarge instance does not expose NUMA
>>> information for the NVMe devices. This means all NVMe devices have
>>> NUMA_NO_NODE by default. Without this patch, random 4k read performance
>>> measured via fio on CPUs from node 1 (around 165k IOPS) is almost 50%
>>> less than CPUs from node 0 (around 315k IOPS). With this patch, CPUs on
>>> both nodes get similar performance (around 315k IOPS).
>>
>> irqbalance doesn't work with this driver though: the interrupts are
>> managed by the kernel. Is there some other reason to explain the perf
>> difference?
>
> Maybe it's because the numa_node goes to the tagset, which allocates
> stuff based on that NUMA node?

Yeah, the only explanation I could come up with is that without this
the allocations get spread, and that somehow helps.  All of this
is a little obscure, but so is the NVMe practice of setting the node id
to first_memory_node, which no other driver does.  I'd really like to
understand what's going on here first.  After that this patch is probably
the right thing, I'd just like to understand why.
