Message-ID: <20231030052550.3157719-1-lixiaoyan@google.com>
Date: Mon, 30 Oct 2023 05:25:45 +0000
From: Coco Li <lixiaoyan@...gle.com>
To: Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
    Neal Cardwell <ncardwell@...gle.com>,
    Mubashir Adnan Qureshi <mubashirq@...gle.com>,
    Paolo Abeni <pabeni@...hat.com>, Andrew Lunn <andrew@...n.ch>,
    Jonathan Corbet <corbet@....net>, David Ahern <dsahern@...nel.org>,
    Daniel Borkmann <daniel@...earbox.net>
Cc: netdev@...r.kernel.org, Chao Wu <wwchao@...gle.com>,
    Wei Wang <weiwan@...gle.com>, Pradeep Nemavat <pnemavat@...gle.com>,
    Coco Li <lixiaoyan@...gle.com>
Subject: [PATCH v6 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption

Currently, variable-heavy structs in the networking stack are organized
chronologically, logically and sometimes by cacheline access.

This patch series attempts to reorganize the core networking stack
variables to minimize cacheline consumption during the phase of data
transfer. Specifically, we looked at the TCP/IP stack and the fast
path definition in TCP.

For documentation purposes, we also added new files for each core data
structure we considered, although not all ended up being modified due
to the number of existing cachelines they span in the fast path. In
the documentation, we recorded all variables we identified on the
fast path and the reasons. We also hope that in the future, when
variables are added or modified, the document can be referred to and
updated accordingly to reflect the latest variable organization.

Tested:
Our tests were run with neper tcp_rr using TCP traffic. The tests have
$cpu number of threads and a variable number of flows (see below).

Tests were run on 6.5-rc1.

Efficiency is computed as cpu seconds / throughput (one tcp_rr round trip).
The following results show the efficiency delta before and after the
patch series is applied.

On AMD platforms with 100Gb/s NIC and 256Mb L3 cache:

IPv4
Flows   with patches      clean kernel      Percent reduction
30k     0.0001736538065   0.0002741191042   -36.65%
20k     0.0001583661752   0.0002712559158   -41.62%
10k     0.0001639148817   0.0002951800751   -44.47%
5k      0.0001859683866   0.0003320642536   -44.00%
1k      0.0002035190546   0.0003152056382   -35.43%

IPv6
Flows   with patches      clean kernel      Percent reduction
30k     0.000202535503    0.0003275329163   -38.16%
20k     0.0002020654777   0.0003411304786   -40.77%
10k     0.0002122427035   0.0003803674705   -44.20%
5k      0.0002348776729   0.0004030403953   -41.72%
1k      0.0002237384583   0.0002813646157   -20.48%

On Intel platforms with 200Gb/s NIC and 105Mb L3 cache:

IPv6
Flows   with patches      clean kernel      Percent reduction
30k     0.0006296537873   0.0006370427753   -1.16%
20k     0.0003451029365   0.0003628016076   -4.88%
10k     0.0003187646958   0.0003346835645   -4.76%
5k      0.0002954676348   0.000311807592    -5.24%
1k      0.0001909169342   0.0001848069709   3.31%

v5 changes:
1) removed snmp patch changes for next net-dev cycle. Pending work: move
   file out of uapi.
2) updated cache group size requirements. Chosen to not use cachelines
   but actual sum of struct member sizes to not make assumptions on
   cacheline sizes.

v6 changes:
1) fixed one comment.
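
For readers checking the tables above: the "Percent reduction" column appears to be the relative change of the patched efficiency against the clean-kernel efficiency, so a negative value means fewer CPU seconds per tcp_rr round trip. The sketch below reproduces that arithmetic under this assumption. The function names `percent_change` and `print_row` are illustrative and are not part of the series.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Relative change of the patched efficiency versus the clean-kernel
 * baseline, in percent. Efficiency is CPU seconds per tcp_rr round
 * trip, so a negative result is an improvement.
 */
double percent_change(double with_patches, double clean_kernel)
{
    assert(clean_kernel > 0.0); /* the baseline must be a real measurement */
    return (with_patches - clean_kernel) / clean_kernel * 100.0;
}

/* Hypothetical helper: print the delta for one row of a results table. */
void print_row(const char *flows, double with_patches, double clean_kernel)
{
    printf("%-4s %+.2f%%\n", flows,
           percent_change(with_patches, clean_kernel));
}
```

Calling `print_row("30k", 0.0001736538065, 0.0002741191042)` with the AMD IPv4 30k row yields a value that rounds to the -36.65% shown in the table.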

Coco Li (5):
  Documentations: Analyze heavily used Networking related structs
  cache: enforce cache groups
  netns-ipv4: reorganize netns_ipv4 fast path variables
  net-device: reorganize net_device fast path variables
  tcp: reorganize tcp_sock fast path variables

 Documentation/networking/index.rst                 |   1 +
 .../networking/net_cachelines/index.rst            |  13 +
 .../net_cachelines/inet_connection_sock.rst        |  47 ++++
 .../networking/net_cachelines/inet_sock.rst        |  41 +++
 .../networking/net_cachelines/net_device.rst       | 175 ++++++++++++
 .../net_cachelines/netns_ipv4_sysctl.rst           | 155 +++++++++++
 .../networking/net_cachelines/snmp.rst             | 132 ++++++++++
 .../networking/net_cachelines/tcp_sock.rst         | 154 +++++++++++
 include/linux/cache.h                              |  25 ++
 include/linux/netdevice.h                          | 117 +++++----
 include/linux/tcp.h                                | 248 ++++++++++--------
 include/net/netns/ipv4.h                           |  47 ++--
 net/core/dev.c                                     |  56 ++++
 net/core/net_namespace.c                           |  43 +++
 net/ipv4/tcp.c                                     |  93 +++++++
 15 files changed, 1165 insertions(+), 182 deletions(-)
 create mode 100644 Documentation/networking/net_cachelines/index.rst
 create mode 100644 Documentation/networking/net_cachelines/inet_connection_sock.rst
 create mode 100644 Documentation/networking/net_cachelines/inet_sock.rst
 create mode 100644 Documentation/networking/net_cachelines/net_device.rst
 create mode 100644 Documentation/networking/net_cachelines/netns_ipv4_sysctl.rst
 create mode 100644 Documentation/networking/net_cachelines/snmp.rst
 create mode 100644 Documentation/networking/net_cachelines/tcp_sock.rst

-- 
2.42.0.820.g83a721a137-goog
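
As an illustration of the idea behind the "cache: enforce cache groups" patch and the v5 note about group size requirements, here is a minimal userspace sketch of a compile-time check that a group of hot struct members stays within a byte budget. This is not the code from the series: the struct `demo_sock`, its fields, the `GROUP_SPAN` macro and the `DEMO_HOT_GROUP_MAX` budget are all hypothetical, and the kernel patch may express the check quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct; the field names are illustrative only. */
struct demo_sock {
    /* Hot group: members assumed to be touched together on the fast path. */
    uint32_t snd_nxt;
    uint32_t snd_una;
    uint16_t mss_cache;
    /* Cold members, kept outside the hot group. */
    uint64_t created_ns;
    char     name[32];
};

/* Bytes from the start of member `first` to the end of member `last`. */
#define GROUP_SPAN(type, first, last) \
    (offsetof(type, last) + sizeof(((type *)0)->last) - offsetof(type, first))

/*
 * Budget expressed as the sum of the member sizes (4 + 4 + 2 bytes),
 * not as a number of cachelines, so the check makes no assumption
 * about the cacheline size of the target machine.
 */
#define DEMO_HOT_GROUP_MAX 10

/* Fails the build if someone grows the hot group past its budget. */
static_assert(GROUP_SPAN(struct demo_sock, snd_nxt, mss_cache) <= DEMO_HOT_GROUP_MAX,
              "hot group grew past its byte budget");

/* Runtime accessor for the same value, handy for logging or tests. */
size_t demo_hot_group_span(void)
{
    return GROUP_SPAN(struct demo_sock, snd_nxt, mss_cache);
}
```

Note that the span equals the sum of the member sizes only when the compiler inserts no padding inside the group. Adding a member that introduces padding would make the span exceed the sum and trip the assertion, which is the kind of silent layout regression such a check is meant to catch.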