lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [thread-next>] [day] [month] [year] [list]
Message-ID: <166393728005.2213882.4162674859542409548.stgit@firesoul>
Date:   Fri, 23 Sep 2022 14:48:00 +0200
From:   Jesper Dangaard Brouer <brouer@...hat.com>
To:     netdev@...r.kernel.org
Cc:     Jesper Dangaard Brouer <brouer@...hat.com>,
        Jakub Kicinski <kuba@...nel.org>,
        John Fastabend <john.fastabend@...il.com>,
        "David S. Miller" <davem@...emloft.net>, ast@...nel.org,
        hawk@...nel.org, daniel@...earbox.net, edumazet@...gle.com,
        pabeni@...hat.com, bpf@...r.kernel.org,
        Lorenzo Bianconi <lorenzo@...nel.org>
Subject: [PATCH net-next] xdp: Adjust xdp_frame layout to avoid using
 bitfields

Practical experience (and advice from Alexei) tell us that bitfields in
structs lead to un-optimized assemply code. I've verified this change
does lead to better x86_64 assemply, both via objdump and playing with
code snippets in godbolt.org.

Using scripts/bloat-o-meter shows the code size is reduced with 24
bytes for xdp_convert_buff_to_frame() that gets inlined e.g. in
i40e_xmit_xdp_tx_ring() which were used for microbenchmarking.

Microbenchmarking results do show improvements, but very small and
varying between 0.5 to 2 nanosec improvement per packet.

The member @metasize is changed from u8 to u32. Future users of this
area could split this into two u16 fields. I've also benchmarked with
two u16 fields showing equal performance gains and code size reduction.

The moved member @frame_sz doesn't change sizeof struct due to existing
padding. Like xdp_buff member @frame_sz is placed next to @flags, which
allows compiler to optimize assignment of these.

Signed-off-by: Jesper Dangaard Brouer <brouer@...hat.com>
---
 include/net/xdp.h |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index 04c852c7a77f..55dbc68bfffc 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -164,13 +164,13 @@ struct xdp_frame {
 	void *data;
 	u16 len;
 	u16 headroom;
-	u32 metasize:8;
-	u32 frame_sz:24;
+	u32 metasize; /* uses lower 8-bits */
 	/* Lifetime of xdp_rxq_info is limited to NAPI/enqueue time,
 	 * while mem info is valid on remote CPU.
 	 */
 	struct xdp_mem_info mem;
 	struct net_device *dev_rx; /* used by cpumap */
+	u32 frame_sz;
 	u32 flags; /* supported values defined in xdp_buff_flags */
 };
 


Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ