Message-ID: <78b2d2ad-4e0e-41b7-95b4-b7fe945dfe13@kernel.org>
Date: Thu, 3 Apr 2025 11:37:19 +0200
From: Jesper Dangaard Brouer <hawk@...nel.org>
To: Jakub Kicinski <kuba@...nel.org>, Jiayuan Chen <jiayuan.chen@...ux.dev>
Cc: bpf@...r.kernel.org, mrpre@....com, Alexei Starovoitov <ast@...nel.org>,
 Daniel Borkmann <daniel@...earbox.net>, Andrii Nakryiko <andrii@...nel.org>,
 Martin KaFai Lau <martin.lau@...ux.dev>, Eduard Zingerman
 <eddyz87@...il.com>, Song Liu <song@...nel.org>,
 Yonghong Song <yonghong.song@...ux.dev>,
 John Fastabend <john.fastabend@...il.com>, KP Singh <kpsingh@...nel.org>,
 Stanislav Fomichev <sdf@...ichev.me>, Hao Luo <haoluo@...gle.com>,
 Jiri Olsa <jolsa@...nel.org>, "David S. Miller" <davem@...emloft.net>,
 Eric Dumazet <edumazet@...gle.com>, Paolo Abeni <pabeni@...hat.com>,
 Simon Horman <horms@...nel.org>, Mykola Lysenko <mykolal@...com>,
 Shuah Khan <shuah@...nel.org>, Willem de Bruijn <willemb@...gle.com>,
 Jason Xing <kerneljasonxing@...il.com>,
 Anton Protopopov <aspsk@...valent.com>,
 Abhishek Chauhan <quic_abchauha@...cinc.com>,
 Jordan Rome <linux@...danrome.com>,
 Martin Kelly <martin.kelly@...wdstrike.com>,
 David Lechner <dlechner@...libre.com>, linux-kernel@...r.kernel.org,
 netdev@...r.kernel.org, linux-kselftest@...r.kernel.org,
 kernel-team <kernel-team@...udflare.com>
Subject: Re: [PATCH bpf v2 2/2] selftests/bpf: add perf test for
 adjust_{head,meta}



On 03/04/2025 02.24, Jakub Kicinski wrote:
> On Mon, 31 Mar 2025 11:23:45 +0800 Jiayuan Chen wrote:
>> which is negligible for the net stack.
>>
>> Before memset
>> ./test_progs -a xdp_adjust_head_perf -v
>> run adjust head with size 6 cost 56 ns
>> run adjust head with size 20 cost 56 ns
>> run adjust head with size 40 cost 56 ns
>> run adjust head with size 200 cost 56 ns
>>
>> After memset
>> ./test_progs -a xdp_adjust_head_perf -v
>> run adjust head with size 6 cost 58 ns
>> run adjust head with size 20 cost 58 ns
>> run adjust head with size 40 cost 58 ns
>> run adjust head with size 200 cost 66 ns
> 
> FWIW I'm not sure if this is "negligible" for XDP like you say,
> but I defer to Jesper :)

It would be too much for the XDP_DROP use-case, e.g. DDoS protection and
driver hardware eval. But this is changing a BPF-helper, which means it
is opt-in by the BPF-programmer.  Thus, we can accept a larger perf
overhead here.

I suspect your 2 nanosec overhead primarily comes from the function call
overhead.  On my AMD production system with SRSO mitigation enabled I
expect to see around 6 ns overhead (5.699 ns), which is sad.

I've done a lot of benchmarking of memset (see [1]). One take-away is
that memset with a small constant size gets compiled into very fast
code that avoids the function call (basically QWORD MOVs).  E.g. memset
with const 32 is extremely fast[2], on my system it takes 0.673 ns (and
0.562 ns comes from for-loop overhead).  Thus, it is possible to do
something faster, as we are clearing very small sizes. I.e. using a
Duff's device construct like I did for the remainder in [2].
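
To illustrate the two ideas above, here is a rough userspace sketch. It is
not the code from [2]; the helper names (clear_const32, zero_qwords_duff,
clear_small) are made up for this example, and it assumes the region is
8-byte aligned with a length that is a multiple of 8. Whether the
constant-size memset really gets inlined depends on the compiler and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Constant-size clear: with the length known at compile time, compilers
 * typically expand this inline into a few 8-byte stores instead of
 * emitting a call to memset(). */
static inline void clear_const32(void *p)
{
	memset(p, 0, 32);
}

/* Hypothetical helper: zero `count` 8-byte words using Duff's device,
 * i.e. a switch that jumps into the middle of an unrolled loop. */
static inline void zero_qwords_duff(uint64_t *p, size_t count)
{
	size_t passes;

	if (count == 0)
		return;

	passes = (count + 7) / 8;	/* loop passes, first one may be partial */
	switch (count % 8) {
	case 0: do { *p++ = 0;	/* fall through */
	case 7:      *p++ = 0;	/* fall through */
	case 6:      *p++ = 0;	/* fall through */
	case 5:      *p++ = 0;	/* fall through */
	case 4:      *p++ = 0;	/* fall through */
	case 3:      *p++ = 0;	/* fall through */
	case 2:      *p++ = 0;	/* fall through */
	case 1:      *p++ = 0;
		} while (--passes > 0);
	}
}

/* Assumption: `p` points at 8-byte aligned storage that may be accessed
 * as uint64_t, and `bytes` is a multiple of 8. */
static inline void clear_small(void *p, size_t bytes)
{
	assert(bytes % 8 == 0);
	zero_qwords_duff(p, bytes / 8);
}
```

A caller with a runtime length would use clear_small(buf, len), while a
caller that always clears the same small size can rely on the constant
memset path.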

In this case, as this is a BPF-helper, I am uncertain if it is worth the
complexity to add such optimizations... I guess not.
This turned into a long way of saying, I'm okay with this change.

[1] 
https://github.com/netoptimizer/prototype-kernel/blob/master/kernel/lib/time_bench_memset.c

[2] 
https://github.com/netoptimizer/prototype-kernel/blob/35b1716d0c300e7fa2c8b6d8cfed2ec81df8f3a4/kernel/lib/time_bench_memset.c#L520-L521

--Jesper
