Message-ID: <CALkn8kLOozs5UO52SQa9PR-CiKx_mqW8VF9US94qN+ixyqnkdQ@mail.gmail.com>
Date: Wed, 28 Feb 2024 12:13:27 +0530
From: Abdul Anshad Azeez <abdul-anshad.azeez@...adcom.com>
To: edumazet@...gle.com, davem@...emloft.net, kuba@...nel.org, 
	pabeni@...hat.com, corbet@....net, dsahern@...nel.org, netdev@...r.kernel.org, 
	linux-kernel@...r.kernel.org
Cc: Boon Ang <boon.ang@...adcom.com>, John Savanyo <john.savanyo@...adcom.com>, 
	Peter Jonasson <peter.jonasson@...adcom.com>, Rajender M <rajender.m@...adcom.com>
Subject: Network performance regression in Linux kernel 6.6 for small socket
 size test cases

During performance regression testing of the Linux kernel, we observed up to a
30% performance decrease in a specific networking workload on the 6.6 kernel
compared to 6.5 (details below). The regression is reproducible both in Linux
VMs running on ESXi and on bare-metal Linux.

Workload details (an illustrative receiver-side sketch follows this list):

Benchmark - Netperf TCP_STREAM
Socket buffer size - 8K
Message size - 256B
MTU - 1500B
Socket option - TCP_NODELAY
# of STREAMs - 32
Direction - Uni-Directional Receive
Duration - 60 Seconds
NIC - Mellanox Technologies ConnectX-6 Dx EN 100G
Server Config - Intel(R) Xeon(R) Gold 6348 CPU @ 2.60GHz & 512G Memory
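
For reference, here is a minimal receiver-side sketch approximating the socket
configuration of this workload. It is not the benchmark itself (netperf was
used); the port number and the hand-rolled receive loop are illustrative
assumptions. The detail relevant to the regression is that a fixed 8K
SO_RCVBUF is applied before the connection is established, which pins the
receive buffer and disables autotuning, while the sender (not shown) uses
TCP_NODELAY and 256-byte sends.

#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
	int lfd = socket(AF_INET, SOCK_STREAM, 0);
	int rcvbuf = 8 * 1024;		/* "Socket buffer size - 8K" */
	int one = 1;
	struct sockaddr_in addr = {
		.sin_family = AF_INET,
		.sin_port = htons(12867),	/* assumed port, not from the report */
		.sin_addr.s_addr = htonl(INADDR_ANY),
	};

	if (lfd < 0) {
		perror("socket");
		return 1;
	}
	setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));

	/* Fixed receive buffer: the kernel doubles this value internally and
	 * stops autotuning the buffer of sockets accepted from this listener.
	 */
	if (setsockopt(lfd, SOL_SOCKET, SO_RCVBUF, &rcvbuf, sizeof(rcvbuf)) < 0)
		perror("setsockopt(SO_RCVBUF)");

	if (bind(lfd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
	    listen(lfd, 1) < 0) {
		perror("bind/listen");
		return 1;
	}

	int cfd = accept(lfd, NULL, NULL);
	if (cfd < 0) {
		perror("accept");
		return 1;
	}

	/* The sender side would set TCP_NODELAY and issue 256-byte send()
	 * calls for 60 seconds, matching the workload above.
	 */
	char buf[256];
	ssize_t n;
	unsigned long long total = 0;

	while ((n = read(cfd, buf, sizeof(buf))) > 0)
		total += (unsigned long long)n;

	printf("received %llu bytes\n", total);
	close(cfd);
	close(lfd);
	return 0;
}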

Bisecting between the 6.5 and 6.6 kernels identified the commit below as the
origin of the regression:

commit - dfa2f0483360d4d6f2324405464c9f281156bd87 ("tcp: get rid of sysctl_tcp_adv_win_scale")
Author - Eric Dumazet <edumazet@...gle.com>
Link - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=dfa2f0483360d4d6f2324405464c9f281156bd87
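
For context, below is a back-of-envelope sketch of how the advertised window
derived from a fixed 8K SO_RCVBUF can change with this commit, as we
understand it. This is not the kernel code: the helper names and constants in
mainline differ, and the payload/truesize ratio used below is an assumed
illustrative value, not a measurement.

#include <stdio.h>

int main(void)
{
	int space = 2 * 8 * 1024;	/* the kernel doubles SO_RCVBUF */

	/* 6.5 and earlier: overhead is fixed by sysctl tcp_adv_win_scale
	 * (default 1), i.e. half of the buffer is assumed to be overhead.
	 */
	int win_old = space - (space >> 1);

	/* 6.6: overhead is derived from the observed payload/truesize ratio
	 * of received skbs. With 256-byte messages and TCP_NODELAY, the
	 * payload carried per skb can be small relative to its truesize;
	 * 0.30 is an assumed illustrative ratio, not a measured one.
	 */
	double assumed_ratio = 0.30;
	int win_new = (int)(space * assumed_ratio);

	printf("fixed-scale rule: ~%d bytes of window from an 8K SO_RCVBUF\n",
	       win_old);
	printf("ratio-based rule: ~%d bytes (assuming payload/truesize = %.2f)\n",
	       win_new, assumed_ratio);
	return 0;
}

If this reading is correct, a small fixed buffer combined with small messages
would leave the connection more window-limited on 6.6, which may be why only
the small fixed-buffer cases regress in the data below.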

Performance data (Linux VM on ESXi):
Test case - TCP_STREAM_RECV throughput in Gbps
(for different socket buffer sizes, with a constant message size of 256B):

Socket buffer size - [LK6.5 vs LK6.6]
8K - [8.4 vs 5.9 Gbps]
16K - [13.4 vs 10.6 Gbps]
32K - [19.1 vs 16.3 Gbps]
64K - [19.6 vs 19.7 Gbps]
Autotune - [19.7 vs 19.6 Gbps]

From the above performance data, we can infer the following (a diagnostic
sketch follows this list):
* The regression is specific to smaller fixed socket buffer sizes (8K, 16K & 32K).
* Increasing the socket buffer size gradually reduces the throughput impact.
* Performance is on par for the larger fixed socket buffer size (64K) and for
the autotuned socket tests.
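
As a diagnostic (not part of the measurements above), the receiver's window
accounting on the two kernels could be compared by polling TCP_INFO on the
accepted socket; a minimal sketch, assuming the file descriptor comes from the
application under test:

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

void dump_rcv_window(int fd)
{
	struct tcp_info ti;
	socklen_t len = sizeof(ti);

	memset(&ti, 0, sizeof(ti));
	if (getsockopt(fd, IPPROTO_TCP, TCP_INFO, &ti, &len) < 0) {
		perror("getsockopt(TCP_INFO)");
		return;
	}

	/* rcv_ssthresh bounds the window the receiver is willing to
	 * advertise; rcv_space feeds receive-buffer autotuning (inactive
	 * here because SO_RCVBUF is fixed).
	 */
	printf("rcv_ssthresh=%u rcv_space=%u\n",
	       ti.tcpi_rcv_ssthresh, ti.tcpi_rcv_space);
}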

We would like to know whether there are any opportunities for optimization in
the test cases with small socket buffer sizes.

Abdul Anshad Azeez
Performance Engineering
Broadcom Inc.

