Message-ID: <da79e5cf-a004-b1e2-9a91-deb709ca0302@iogearbox.net>
Date: Tue, 17 Oct 2023 19:07:04 +0200
From: Daniel Borkmann <daniel@...earbox.net>
To: Eric Dumazet <edumazet@...gle.com>
Cc: Florian Fainelli <f.fainelli@...il.com>, Coco Li <lixiaoyan@...gle.com>,
 Jakub Kicinski <kuba@...nel.org>, Neal Cardwell <ncardwell@...gle.com>,
 Mubashir Adnan Qureshi <mubashirq@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
 netdev@...r.kernel.org, Chao Wu <wwchao@...gle.com>, Wei Wang <weiwan@...gle.com>
Subject: Re: [PATCH v2 net-next 0/5] Analyze and Reorganize core Networking Structs to optimize cacheline consumption

On 10/17/23 6:50 PM, Eric Dumazet wrote:
> On Tue, Oct 17, 2023 at 11:06 AM Daniel Borkmann <daniel@...earbox.net> wrote:
>> On 10/17/23 5:46 AM, Florian Fainelli wrote:
>>> On 10/16/2023 6:47 PM, Coco Li wrote:
>>>> Currently, variable-heavy structs in the networking stack is organized
>>>> chronologically, logically and sometimes by cache line access.
>>>>
>>>> This patch series attempts to reorganize the core networking stack
>>>> variables to minimize cacheline consumption during the phase of data
>>>> transfer. Specifically, we looked at the TCP/IP stack and the fast
>>>> path definition in TCP.
>>>>
>>>> For documentation purposes, we also added new files for each core data
>>>> structure we considered, although not all ended up being modified due
>>>> to the amount of existing cache line they span in the fast path. In
>>>> the documentation, we recorded all variables we identified on the
>>>> fast path and the reasons. We also hope that in the future when
>>>> variables are added/modified, the document can be referred to and
>>>> updated accordingly to reflect the latest variable organization.
>>>
>>> This is great stuff, while Eric mentioned this work during Netconf'23 one
>>> concern that came up however is how can we make sure that a future change
>>> which adds/removes/shuffles members in those structures is not going to be
>>> detrimental to the work you just did? Is there a way to "lock" the
>>> structure layout to avoid causing performance drops?
>>>
>>> I suppose we could use pahole before/after for these structures and ensure
>>> that the layout on a cacheline basis remains preserved, but that means
>>> adding custom scripts to CI.
>>
>> It should be possible without extra CI. We could probably have zero-sized markers
>> as we have in sk_buff e.g. __cloned_offset[0], and use some macros to force grouping.
>>
>> ASSERT_CACHELINE_GROUP() could then throw a build error for example if the member is
>> not within __begin_cacheline_group and __end_cacheline_group :
>>
>> diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
>> index 9ea3ec906b57..c664e0594da4 100644
>> --- a/include/linux/netdevice.h
>> +++ b/include/linux/netdevice.h
>> @@ -2059,6 +2059,7 @@ struct net_device {
>>           */
>>
>>          /* TX read-mostly hotpath */
>> +       __begin_cacheline_group(tx_read_mostly);
>>          unsigned long long      priv_flags;
>>          const struct net_device_ops *netdev_ops;
>>          const struct header_ops *header_ops;
>> @@ -2085,6 +2086,7 @@ struct net_device {
>>   #ifdef CONFIG_NET_XGRESS
>>          struct bpf_mprog_entry __rcu *tcx_egress;
>>   #endif
>> +       __end_cacheline_group(tx_read_mostly);
>>
>>          /* TXRX read-mostly hotpath */
>>          unsigned int            flags;
>> diff --git a/net/core/dev.c b/net/core/dev.c
>> index 97e7b9833db9..2a91bd4077ad 100644
>> --- a/net/core/dev.c
>> +++ b/net/core/dev.c
>> @@ -11523,6 +11523,9 @@ static int __init net_dev_init(void)
>>
>>          BUG_ON(!dev_boot_phase);
>>
>> +       ASSERT_CACHELINE_GROUP(tx_read_mostly, priv_flags);
>> +       ASSERT_CACHELINE_GROUP(tx_read_mostly, netdev_ops);

nit, should have been sth like:

   ASSERT_CACHELINE_GROUP(struct net_device, netdev_ops, tx_read_mostly)

> Great idea, we only need to generate these automatically from the file
> describing the fields (currently in Documentation/ )
>
> I think the initial intent was to find a way to generate the layout of
> the structure itself, but this looked a bit tricky.

Agree, ideally this could be scripted from the Documentation/ file of this
series, and perhaps the latter may not even be needed then if we have it
self-documented in code behind some macro magic with BUILD_BUG_ON assertion
which probes offsetof wrt the field being within markers.
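
For illustration only, here is a minimal userspace sketch of the marker-plus-assertion idea, assuming GCC or Clang (zero-length arrays are a GNU extension). The macro names follow the ones floated in this thread, but their definitions here are assumptions and not code from the series. `struct demo_dev` is a hypothetical stand-in for `struct net_device`, and C11 `_Static_assert` stands in for the kernel's BUILD_BUG_ON.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical zero-sized markers; they occupy no space in the struct. */
#define __begin_cacheline_group(GROUP) \
	char __begin_cacheline_group__##GROUP[0]
#define __end_cacheline_group(GROUP) \
	char __end_cacheline_group__##GROUP[0]

/* Break the build unless MEMBER lies entirely between the GROUP markers. */
#define ASSERT_CACHELINE_GROUP(TYPE, MEMBER, GROUP)                           \
	_Static_assert(offsetof(TYPE, MEMBER) >=                               \
		       offsetof(TYPE, __begin_cacheline_group__##GROUP) &&     \
		       offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER) <= \
		       offsetof(TYPE, __end_cacheline_group__##GROUP),         \
		       #MEMBER " is outside cacheline group " #GROUP)

/* Hypothetical, heavily reduced stand-in for struct net_device. */
struct demo_dev {
	/* TX read-mostly hotpath */
	__begin_cacheline_group(tx_read_mostly);
	unsigned long long priv_flags;
	const void *netdev_ops;
	const void *header_ops;
	__end_cacheline_group(tx_read_mostly);

	/* TXRX read-mostly hotpath */
	unsigned int flags;
};

/* These hold today; moving a member out of the group fails compilation. */
ASSERT_CACHELINE_GROUP(struct demo_dev, priv_flags, tx_read_mostly);
ASSERT_CACHELINE_GROUP(struct demo_dev, netdev_ops, tx_read_mostly);
```

Adding `ASSERT_CACHELINE_GROUP(struct demo_dev, flags, tx_read_mostly);` to this sketch would stop the build, since `flags` sits after the end marker. Note that the check only proves group membership; it does not by itself bound how many cache lines the group spans.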