linux-kernel - Re: [PATCH bpf-next 4/4] selftests/bpf: Add test for checking correct nop of optimized usdt

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAEf4Bzb1Du11wwUK=qXeWi0V-nN7qc7VsomGiaOM_8eSH2oHtg@mail.gmail.com>
Date: Mon, 24 Nov 2025 09:34:45 -0800
From: Andrii Nakryiko <andrii.nakryiko@...il.com>
To: Jiri Olsa <jolsa@...nel.org>
Cc: Andrii Nakryiko <andrii@...nel.org>, bpf@...r.kernel.org, linux-kernel@...r.kernel.org, 
	Song Liu <songliubraving@...com>, Yonghong Song <yhs@...com>, 
	John Fastabend <john.fastabend@...il.com>
Subject: Re: [PATCH bpf-next 4/4] selftests/bpf: Add test for checking correct
 nop of optimized usdt

On Mon, Nov 17, 2025 at 12:36 AM Jiri Olsa <jolsa@...nel.org> wrote:
>
> Adding test that attaches bpf program on usdt probe in 2 scenarios;
>
> - attach program on top of usdt_1 which is standard nop probe
>   incidentally followed by nop5. The usdt probe does not have
>   extra data in elf note record, so we expect the probe to land
>   on the first nop without being optimized.
>
> - attach program on top of usdt_2 which is probe defined on top
>   of nop,nop5 combo. The extra data in the elf note record and
>   presence of upeobe syscall ensures that the probe is placed
>   on top of nop5 and optimized.
>
> Signed-off-by: Jiri Olsa <jolsa@...nel.org>
> ---
>  tools/testing/selftests/bpf/.gitignore        |  2 +
>  tools/testing/selftests/bpf/Makefile          |  7 +-
>  tools/testing/selftests/bpf/prog_tests/usdt.c | 82 +++++++++++++++++++
>  tools/testing/selftests/bpf/progs/test_usdt.c |  9 ++
>  tools/testing/selftests/bpf/usdt_1.c          | 14 ++++
>  tools/testing/selftests/bpf/usdt_2.c          | 13 +++
>  6 files changed, 126 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/bpf/usdt_1.c
>  create mode 100644 tools/testing/selftests/bpf/usdt_2.c
>

can you please add a simple uprobe benchmark so that we can compare
nop1 vs nop5 performance easily? See below, I'd actually force nop1
for existing test with custom USDT_NOP override, and assume nop1+nop5
as a default case for nop5 bench.

> diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
> index be1ee7ba7ce0..89f480729a6b 100644
> --- a/tools/testing/selftests/bpf/.gitignore
> +++ b/tools/testing/selftests/bpf/.gitignore
> @@ -45,3 +45,5 @@ xdp_synproxy
>  xdp_hw_metadata
>  xdp_features
>  verification_cert.h
> +usdt_1
> +usdt_2
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 34ea23c63bd5..4a21657e45f7 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -733,6 +733,10 @@ $(VERIFICATION_CERT) $(PRIVATE_KEY): $(VERIFY_SIG_SETUP)
>  $(VERIFY_SIG_HDR): $(VERIFICATION_CERT)
>         $(Q)xxd -i -n test_progs_verification_cert $< > $@
>
> +ifeq ($(SRCARCH),$(filter $(SRCARCH),x86))
> +USDTX := usdt_1.c usdt_2.c
> +endif
> +

why not compile it unconditionally, why complicating makefile if we
can do x86-64-specific logic in corresponding files?

>  # Define test_progs test runner.
>  TRUNNER_TESTS_DIR := prog_tests
>  TRUNNER_BPF_PROGS_DIR := progs
> @@ -754,7 +758,8 @@ TRUNNER_EXTRA_SOURCES := test_progs.c               \
>                          json_writer.c          \
>                          $(VERIFY_SIG_HDR)              \
>                          flow_dissector_load.h  \
> -                        ip_check_defrag_frags.h
> +                        ip_check_defrag_frags.h \
> +                        $(USDTX)
>  TRUNNER_LIB_SOURCES := find_bit.c
>  TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read                          \
>                        $(OUTPUT)/liburandom_read.so                     \
> diff --git a/tools/testing/selftests/bpf/prog_tests/usdt.c b/tools/testing/selftests/bpf/prog_tests/usdt.c
> index f4be5269fa90..a8ca2920c009 100644
> --- a/tools/testing/selftests/bpf/prog_tests/usdt.c
> +++ b/tools/testing/selftests/bpf/prog_tests/usdt.c
> @@ -247,6 +247,86 @@ static void subtest_basic_usdt(bool optimized)
>  #undef TRIGGER
>  }
>
> +#ifdef __x86_64
> +extern void usdt_1(void);
> +extern void usdt_2(void);
> +
> +/* nop, nop5 */
> +static unsigned char nop15[6] = { 0x90, 0x0f, 0x1f, 0x44, 0x00, 0x00 };

nop15 is a very confusing name for this :) nop1_nop5_combo ?

> +
> +static void *find_nop15(void *fn)
> +{
> +       int i;
> +
> +       for (i = 0; i < 10; i++) {
> +               if (!memcmp(nop15, fn + i, 5))
> +                       return fn + i;
> +       }
> +       return NULL;
> +}
> +

[...]

>  char _license[] SEC("license") = "GPL";
> diff --git a/tools/testing/selftests/bpf/usdt_1.c b/tools/testing/selftests/bpf/usdt_1.c
> new file mode 100644
> index 000000000000..0e00702b1701
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/usdt_1.c
> @@ -0,0 +1,14 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * Include usdt.h with defined USDT_NOP macro will switch
> + * off the extra info in usdt probe.
> + */
> +#define USDT_NOP .byte 0x90, 0x0f, 0x1f, 0x44, 0x00, 0x00
> +#include "usdt.h"

upstream usdt.h will use nop1+nop5 on x86-64 unconditionally, so I'd
invert this, and *force* one of the cases to a single nop1 with custom
USDT_NOP, wdyt?

> +
> +__attribute__((aligned(16)))
> +void usdt_1(void)
> +{
> +       USDT(optimized_attach, usdt_1);
> +}
> diff --git a/tools/testing/selftests/bpf/usdt_2.c b/tools/testing/selftests/bpf/usdt_2.c
> new file mode 100644
> index 000000000000..fcb7417a1953
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/usdt_2.c
> @@ -0,0 +1,13 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * Include usdt.h with the extra info in usdt probe and
> + * nop/nop5 combo for usdt attach point.
> + */
> +#include "usdt.h"
> +
> +__attribute__((aligned(16)))
> +void usdt_2(void)
> +{
> +       USDT(optimized_attach, usdt_2);
> +}
> --
> 2.51.1
>