Message-ID: <2d2f76b5-6af6-b6f0-5c05-cc24cb355fe8@iogearbox.net>
Date: Tue, 17 Oct 2023 11:06:06 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Florian Fainelli <f.fainelli@...il.com>, Coco Li <lixiaoyan@...gle.com>,
Jakub Kicinski <kuba@...nel.org>, Eric Dumazet <edumazet@...gle.com>,
Neal Cardwell <ncardwell@...gle.com>,
Mubashir Adnan Qureshi <mubashirq@...gle.com>,
Paolo Abeni <pabeni@...hat.com>
Cc: netdev@...r.kernel.org, Chao Wu <wwchao@...gle.com>,
Wei Wang <weiwan@...gle.com>
Subject: Re: [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking
Structs to optimize cacheline consumption
On 10/17/23 5:46 AM, Florian Fainelli wrote:
> On 10/16/2023 6:47 PM, Coco Li wrote:
>> Currently, variable-heavy structs in the networking stack are organized
>> chronologically, logically, and sometimes by cacheline access.
>>
>> This patch series attempts to reorganize the core networking stack
>> variables to minimize cacheline consumption during the phase of data
>> transfer. Specifically, we looked at the TCP/IP stack and the fast
>> path definition in TCP.
>>
>> For documentation purposes, we also added new files for each core data
>> structure we considered, although not all ended up being modified due
>> to the number of cache lines they already span in the fast path. In
>> the documentation, we recorded all variables we identified on the
>> fast path and the reasons for their inclusion. We also hope that when
>> variables are added or modified in the future, the document can be
>> consulted and updated to reflect the latest variable organization.
>
> This is great stuff. While Eric mentioned this work during Netconf'23, one concern that came up, however, is how we can make sure that a future change which adds/removes/shuffles members in those structures is not detrimental to the work you just did. Is there a way to "lock" the structure layout to avoid causing performance drops?
>
> I suppose we could use pahole before/after for these structures and ensure that the layout on a cacheline basis remains preserved, but that means adding custom scripts to CI.
It should be possible without extra CI. We could probably have zero-sized markers
as we have in sk_buff, e.g. __cloned_offset[0], and use some macros to force
grouping. ASSERT_CACHELINE_GROUP() could then throw a build error, for example,
if a member is not within __begin_cacheline_group and __end_cacheline_group:
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 9ea3ec906b57..c664e0594da4 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -2059,6 +2059,7 @@ struct net_device {
  */
 
 	/* TX read-mostly hotpath */
+	__begin_cacheline_group(tx_read_mostly);
 	unsigned long long priv_flags;
 	const struct net_device_ops *netdev_ops;
 	const struct header_ops *header_ops;
@@ -2085,6 +2086,7 @@ struct net_device {
 #ifdef CONFIG_NET_XGRESS
 	struct bpf_mprog_entry __rcu *tcx_egress;
 #endif
+	__end_cacheline_group(tx_read_mostly);
 
 	/* TXRX read-mostly hotpath */
 	unsigned int flags;
diff --git a/net/core/dev.c b/net/core/dev.c
index 97e7b9833db9..2a91bd4077ad 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -11523,6 +11523,9 @@ static int __init net_dev_init(void)
 	BUG_ON(!dev_boot_phase);
 
+	ASSERT_CACHELINE_GROUP(tx_read_mostly, priv_flags);
+	ASSERT_CACHELINE_GROUP(tx_read_mostly, netdev_ops);
+	[...]
+
 	if (dev_proc_init())
 		goto out;