Message-ID: <20231017014716.3944813-1-lixiaoyan@google.com>
Date: Tue, 17 Oct 2023 01:47:11 +0000
From: Coco Li <lixiaoyan@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
Neal Cardwell <ncardwell@...gle.com>, Mubashir Adnan Qureshi <mubashirq@...gle.com>,
Paolo Abeni <pabeni@...hat.com>
Cc: netdev@...r.kernel.org, Chao Wu <wwchao@...gle.com>, Wei Wang <weiwan@...gle.com>,
Coco Li <lixiaoyan@...gle.com>
Subject: [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking
Structs to optimize cacheline consumption

Currently, variable-heavy structs in the networking stack are
organized chronologically, logically, and sometimes by cache line
access.

This patch series reorganizes the core networking stack variables to
minimize cacheline consumption during the data-transfer phase.
Specifically, we looked at the TCP/IP stack and the fast path
definitions in TCP.
For documentation purposes, we also added new files for each core
data structure we considered, although not all of them ended up being
modified, due to the number of cache lines they already span in the
fast path. In the documentation, we recorded all variables we
identified on the fast path and the reasons they belong there. We
also hope that, when variables are added or modified in the future,
the documents can be consulted and updated to reflect the latest
variable organization.

Tested:
Our tests were run with neper tcp_rr using TCP traffic. The tests use
$cpu threads and a variable number of flows (see below). Tests were
run on kernel 6.5-rc1.

Efficiency is computed as cpu seconds / throughput (one tcp_rr round
trip). The following results show the efficiency delta before and
after the patch series is applied.

On AMD platforms with a 100Gb/s NIC and 256MB of L3 cache:

IPv4
Flows   with patches      clean kernel      Percent reduction
30k     0.0001736538065   0.0002741191042   -36.65%
20k     0.0001583661752   0.0002712559158   -41.62%
10k     0.0001639148817   0.0002951800751   -44.47%
5k      0.0001859683866   0.0003320642536   -44.00%
1k      0.0002035190546   0.0003152056382   -35.43%

IPv6
Flows   with patches      clean kernel      Percent reduction
30k     0.000202535503    0.0003275329163   -38.16%
20k     0.0002020654777   0.0003411304786   -40.77%
10k     0.0002122427035   0.0003803674705   -44.20%
5k      0.0002348776729   0.0004030403953   -41.72%
1k      0.0002237384583   0.0002813646157   -20.48%

On Intel platforms with a 200Gb/s NIC and 105MB of L3 cache:

IPv6
Flows   with patches      clean kernel      Percent reduction
30k     0.0006296537873   0.0006370427753   -1.16%
20k     0.0003451029365   0.0003628016076   -4.88%
10k     0.0003187646958   0.0003346835645   -4.76%
5k      0.0002954676348   0.000311807592    -5.24%
1k      0.0001909169342   0.0001848069709   +3.31%
Chao Wu (1):
  net-snmp: reorganize SNMP fast path variables
Coco Li (4):
Documentations: Analyze heavily used Networking related structs
netns-ipv4: reorganize netns_ipv4 fast path variables
net-device: reorganize net_device fast path variables
tcp: reorganize tcp_sock fast path variables
.../net_cachelines/inet_connection_sock.rst | 42 ++++
.../networking/net_cachelines/inet_sock.rst | 37 +++
.../networking/net_cachelines/net_device.rst | 167 ++++++++++++
.../net_cachelines/netns_ipv4_sysctl.rst | 151 +++++++++++
.../networking/net_cachelines/snmp.rst | 128 ++++++++++
.../networking/net_cachelines/tcp_sock.rst | 148 +++++++++++
include/linux/netdevice.h | 99 ++++----
include/linux/tcp.h | 238 +++++++++---------
include/net/netns/ipv4.h | 41 +--
include/uapi/linux/snmp.h | 34 ++-
10 files changed, 896 insertions(+), 189 deletions(-)
create mode 100644 Documentation/networking/net_cachelines/inet_connection_sock.rst
create mode 100644 Documentation/networking/net_cachelines/inet_sock.rst
create mode 100644 Documentation/networking/net_cachelines/net_device.rst
create mode 100644 Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
create mode 100644 Documentation/networking/net_cachelines/snmp.rst
create mode 100644 Documentation/networking/net_cachelines/tcp_sock.rst
--
2.42.0.655.g421f12c284-goog