Message-ID: <529122c4-a704-4d3a-8ec0-98552e7a87a2@kernel.org>
Date: Thu, 6 Mar 2025 11:14:33 +0100
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: arthur@...hurfabre.com, netdev@...r.kernel.org, bpf@...r.kernel.org
Cc: jakub@...udflare.com, yan@...udflare.com, jbrandeburg@...udflare.com,
thoiland@...hat.com, lbiancon@...hat.com,
Arthur Fabre <afabre@...udflare.com>
Subject: Re: [PATCH RFC bpf-next 06/20] trait: Replace memmove calls with
inline move
On 05/03/2025 15.32, arthur@...hurfabre.com wrote:
> From: Arthur Fabre <afabre@...udflare.com>
>
> When inserting or deleting traits, we need to move any subsequent
> traits over.
>
> Replace the memmove() calls with an inline implementation to avoid the
> function call overhead. This is especially expensive on AMD with SRSO.
>
> In practice we shouldn't have too much data to move around, and we're
> naturally limited to 238 bytes max, so a dumb implementation should
> hopefully be fast enough.
>
> Jesper Brouer kindly ran benchmarks on real hardware with three configs:
> - Intel: E5-1650 v4
> - AMD SRSO: 9684X SRSO
> - AMD IBPB: 9684X SRSO=IBPB
>
>                   Intel   AMD IBPB   AMD SRSO
> xdp-trait-get     5.530      3.901      9.188   (ns/op)
> xdp-trait-set     7.538      4.941     10.050   (ns/op)
> xdp-trait-move   14.245      8.865     14.834   (ns/op)
> function call     1.319      1.359      5.703   (ns/op)
> indirect call     8.922      6.251     10.329   (ns/op)
>
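
To make the "inline move" idea above concrete, here is a minimal sketch in
C of an overlap-safe byte move that avoids calling memmove(). The name
trait_move_inline() and the byte-at-a-time loops are assumptions made for
illustration; this is not the code from the patch.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not taken from the patch): move n bytes from src
 * to dst, where both point into the same buffer and may overlap.
 * Behaves like memmove(), but can be inlined at the call site.
 */
static inline void trait_move_inline(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;
	const unsigned char *s = src;

	if (d == s || n == 0)
		return;

	if (d < s) {
		/* Destination is lower: copy forwards so unread source bytes are not overwritten. */
		while (n--)
			*d++ = *s++;
	} else {
		/* Destination is higher: copy backwards, starting from the last byte. */
		d += n;
		s += n;
		while (n--)
			*--d = *--s;
	}
}
```

With at most 238 bytes to move, a simple loop like this trades peak
throughput for not paying the function call cost on every trait insert
or delete.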
I've done extensive *micro* benchmarking, documented here:
 - https://github.com/xdp-project/xdp-project/tree/main/areas/hints
 - In the traits0X_* files

The latest results, which correspond to this patchset, are in this file:
 - https://github.com/xdp-project/xdp-project/blob/main/areas/hints/traits07_bench-009.org

I've not done XDP_REDIRECT testing, which would likely show the effect of
the bitfield change in xdp_frame that Olek pointed out.

--Jesper