Date: Thu, 13 Jul 2023 07:59:32 +0200
From: Heiner Kallweit <hkallweit1@...il.com>
To: Anuj Gupta <anuj20.g@...sung.com>, davem@...emloft.net
Cc: holger@...lied-asynchrony.com, kai.heng.feng@...onical.com,
 simon.horman@...igine.com, nic_swsd@...ltek.com, netdev@...r.kernel.org,
 linux-nvme@...ts.infradead.org
Subject: Re: Performance Regression due to ASPM disable patch

On 12.07.2023 17:55, Anuj Gupta wrote:
> Hi,
> 
> I see a performance regression for read/write workloads on our NVMe over
> Fabrics setup, which uses TCP as transport.
> IOPS drop by 23% for 4k-randread [1] and by 18% for 4k-randwrite [2].
> 
> I bisected and found that the commit
> e1ed3e4d91112027b90c7ee61479141b3f948e6a ("r8169: disable ASPM during
> NAPI poll") is the trigger.
> When I revert this commit, the performance drop goes away.
> 
> The target machine uses a realtek ethernet controller - 
> root@...tpc:/home/test# lspci | grep -i eth
> 29:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Device 2600
> (rev 21)
> 2a:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. Killer
> E3000 2.5GbE Controller (rev 03)
> 
> I tried to disable aspm by passing "pcie_aspm=off" as boot parameter and
> by setting pcie aspm policy to performance. But it didn't improve the
> performance.
> I wonder if this is already known, and something different should be
> done to handle the original issue? 
> 
> [1] fio randread
> fio -direct=1 -iodepth=1 -rw=randread -ioengine=psync -bs=4k -numjobs=1
> -runtime=30 -group_reporting -filename=/dev/nvme1n1 -name=psync_read
> -output=psync_read
> [2] fio randwrite
> fio -direct=1 -iodepth=1 -rw=randwrite -ioengine=psync -bs=4k -numjobs=1
> -runtime=30 -group_reporting -filename=/dev/nvme1n1 -name=psync_read
> -output=psync_write
> 
> 
I can imagine a certain performance impact of this commit if there are
lots of small packets handled by individual NAPI polls.
Maybe it's also chip-version specific.
You have two NICs; do you see the issue with both of them?
Related: What's your line speed, 1Gbps or 2.5Gbps?
Can you reproduce the performance impact with iperf?
Do you use any network optimization settings for latency vs. performance?
For example: is interrupt coalescing configured, and is TSO(6) enabled?
An ethtool -k output may provide further insight.
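
For reference, the details asked about above could be collected roughly as
follows. This is only a sketch: the helper name collect_nic_info and the
default interface name eth0 are placeholders that do not come from this
thread, and iperf3 is assumed as the iperf variant. Some drivers do not
support every ethtool query, so the script tolerates failures.

```shell
#!/bin/sh
# Sketch: gather link speed, offload and coalescing settings for one NIC.
# "eth0" is a placeholder; set IFACE to the real interface name.

collect_nic_info() {
    dev="$1"

    echo "== link speed (Mb/s) for $dev =="
    # sysfs reports the negotiated speed; reading fails if the link is down
    cat "/sys/class/net/$dev/speed" 2>/dev/null || echo "unknown"

    if command -v ethtool >/dev/null 2>&1; then
        echo "== offload features (TSO etc.) =="
        # ethtool -k lists offload features; keep only the segmentation lines
        ethtool -k "$dev" 2>&1 | grep -i 'segmentation' || echo "(none reported)"
        echo "== interrupt coalescing =="
        ethtool -c "$dev" 2>&1 || echo "(not supported by this driver)"
    else
        echo "== ethtool not installed, skipping offload and coalescing =="
    fi
}

collect_nic_info "${IFACE:-eth0}"

# Raw network throughput, independent of NVMe (run manually, assuming iperf3):
#   on the target:     iperf3 -s
#   on the initiator:  iperf3 -c <target-ip> -t 30
```

Running it once per NIC, e.g. IFACE=<name> sh ./script.sh, would show whether
the two controllers differ in speed or offload settings.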

