[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAP-5=fUbrx9JKEO01Fc=J2LS-wy76Qc_yck+t5_crxh43z3CpA@mail.gmail.com>
Date: Tue, 2 Sep 2025 11:20:09 -0700
From: Ian Rogers <irogers@...gle.com>
To: Namhyung Kim <namhyung@...nel.org>, song@...nel.org,
Yonghong Song <yonghong.song@...ux.dev>, jolsa@...nel.org,
Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: Blake Jones <blakejones@...gle.com>, ast@...nel.org, daniel@...earbox.net,
andrii@...nel.org, martin.lau@...ux.dev, eddyz87@...il.com,
john.fastabend@...il.com, kpsingh@...nel.org, sdf@...ichev.me,
haoluo@...gle.com, mykolal@...com, shuah@...nel.org, ihor.solodrai@...ux.dev,
bpf@...r.kernel.org, linux-kernel@...r.kernel.org,
linux-kselftest@...r.kernel.org,
linux-perf-users <linux-perf-users@...r.kernel.org>, Howard Chu <howardchu95@...il.com>
Subject: Re: [PATCH v3 1/2] libbpf: add support for printing BTF character
arrays as strings
On Tue, Sep 2, 2025 at 9:47 AM Ian Rogers <irogers@...gle.com> wrote:
>
> On Mon, Sep 1, 2025 at 10:04 AM Ian Rogers <irogers@...gle.com> wrote:
> >
> > On Mon, Sep 1, 2025 at 12:57 AM Namhyung Kim <namhyung@...nel.org> wrote:
> > >
> > > Hello,
> > >
> > > On Sun, Aug 31, 2025 at 09:17:34PM -0700, Yonghong Song wrote:
> > > >
> > > >
> > > > On 8/29/25 10:19 PM, Ian Rogers wrote:
> > > > > On Thu, Jun 5, 2025 at 2:00 PM <patchwork-bot+netdevbpf@...nel.org> wrote:
> > > > > > Hello:
> > > > > >
> > > > > > This series was applied to bpf/bpf-next.git (master)
> > > > > > by Andrii Nakryiko <andrii@...nel.org>:
> > > > > >
> > > > > > On Tue, 3 Jun 2025 13:37:00 -0700 you wrote:
> > > > > > > The BTF dumper code currently displays arrays of characters as just that -
> > > > > > > arrays, with each character formatted individually. Sometimes this is what
> > > > > > > makes sense, but it's nice to be able to treat that array as a string.
> > > > > > >
> > > > > > > This change adds a special case to the btf_dump functionality to allow
> > > > > > > 0-terminated arrays of single-byte integer values to be printed as
> > > > > > > character strings. Characters for which isprint() returns false are
> > > > > > > printed as hex-escaped values. This is enabled when the new ".emit_strings"
> > > > > > > is set to 1 in the btf_dump_type_data_opts structure.
> > > > > > >
> > > > > > > [...]
> > > > > > Here is the summary with links:
> > > > > > - [v3,1/2] libbpf: add support for printing BTF character arrays as strings
> > > > > > https://git.kernel.org/bpf/bpf-next/c/87c9c79a02b4
> > > > > > - [v3,2/2] Tests for the ".emit_strings" functionality in the BTF dumper.
> > > > > > https://git.kernel.org/bpf/bpf-next/c/a570f386f3d1
> > > > > >
> > > > > > You are awesome, thank you!
> > > > > I believe this patch is responsible for segvs occurring in v6.17 in
> > > > > various perf tests when the perf tests run in parallel. There's lots
> > > >
> > > > Could you share the command line to reproduce this failure?
> > > > This will help debugging. Thanks!
> > >
> > > My reproducer is below:
> > >
> > > terminal 1: run perf trace in a loop.
> > >
> > > $ while true; do sudo ./perf trace true; done
> > >
> > > terminal 2: run perf record in a loop until hit the segfault.
> > >
> > > $ while true; do sudo ./perf record true || break; done
> > > ...
> > > perf: Segmentation fault
> > > #0 0x560b2db790e4 in dump_stack debug.c:366
> > > #1 0x560b2db7915a in sighandler_dump_stack debug.c:378
> > > #2 0x560b2d973b1b in sigsegv_handler builtin-record.c:722
> > > #3 0x7f975f249df0 in __restore_rt libc_sigaction.c:0
> > > #4 0x560b2dca1ee6 in snprintf_hex bpf-event.c:39
> > > #5 0x560b2dca2306 in synthesize_bpf_prog_name bpf-event.c:144
> > > #6 0x560b2dca2d92 in bpf_metadata_create bpf-event.c:401
> > > #7 0x560b2dca3838 in perf_event__synthesize_one_bpf_prog bpf-event.c:673
> > > #8 0x560b2dca3dd5 in perf_event__synthesize_bpf_events bpf-event.c:798
> > > #9 0x560b2d977ef5 in record__synthesize builtin-record.c:2131
> > > #10 0x560b2d9797c1 in __cmd_record builtin-record.c:2581
> > > #11 0x560b2d97db30 in cmd_record builtin-record.c:4376
> > > #12 0x560b2da0672e in run_builtin perf.c:349
> > > #13 0x560b2da069c6 in handle_internal_command perf.c:401
> > > #14 0x560b2da06b1f in run_argv perf.c:448
> > > #15 0x560b2da06e68 in main perf.c:555
> > > #16 0x7f975f233ca8 in __libc_start_call_main libc_start_call_main.h:74
> > > #17 0x7f975f233d65 in __libc_start_main_alias_2 libc-start.c:128
> > > #18 0x560b2d959b11 in _start perf[4cb11]
> > >
> > >
> > > I manually ran it with gdb to get some more hints.
> > >
> > > Thread 1 "perf" received signal SIGSEGV, Segmentation fault.
> > > 0x00005555558e8ee6 in snprintf_hex (buf=0x5555562c1d79 "", size=503, data=0x40 <error: Cannot access memory at address 0x40>, len=8)
> > > at util/bpf-event.c:39
> > > 39 ret += snprintf(buf + ret, size - ret, "%02x", data[i]);
> > >
> > > The data is bpf_prog_info->prog_tags and it's called from
> > > synthesize_bpf_prog_name().
> > >
> > > (gdb) bt
> > > #0 0x00005555558e8ee6 in snprintf_hex (buf=0x5555562c1d79 "", size=503, data=0x40 <error: Cannot access memory at address 0x40>,
> > > len=8) at util/bpf-event.c:39
> > > #1 0x00005555558e9306 in synthesize_bpf_prog_name (buf=0x5555562c1d70 "bpf_prog_", size=512, info=0x55555665e400, btf=0x5555562c5630,
> > > sub_id=0) at util/bpf-event.c:144
> > > #2 0x00005555558e9db5 in bpf_metadata_create (info=0x55555665e400) at util/bpf-event.c:403
> > > #3 0x00005555558ea85b in perf_event__synthesize_one_bpf_prog (session=0x555556178510,
> > > process=0x5555555ba7ab <process_synthesized_event>, machine=0x555556178728, fd=25, event=0x5555561b73a0,
> > > opts=0x5555560d33a8 <record+328>) at util/bpf-event.c:674
> > > #4 0x00005555558eadf8 in perf_event__synthesize_bpf_events (session=0x555556178510,
> > > process=0x5555555ba7ab <process_synthesized_event>, machine=0x555556178728, opts=0x5555560d33a8 <record+328>)
> > > at util/bpf-event.c:799
> > > #5 0x00005555555beef5 in record__synthesize (rec=0x5555560d3260 <record>, tail=false) at builtin-record.c:2131
> > > #6 0x00005555555c07c1 in __cmd_record (rec=0x5555560d3260 <record>, argc=1, argv=0x7fffffffe2e0) at builtin-record.c:2581
> > > #7 0x00005555555c4b30 in cmd_record (argc=1, argv=0x7fffffffe2e0) at builtin-record.c:4376
> > > #8 0x000055555564d72e in run_builtin (p=0x5555560d63c0 <commands+288>, argc=6, argv=0x7fffffffe2e0) at perf.c:349
> > > #9 0x000055555564d9c6 in handle_internal_command (argc=6, argv=0x7fffffffe2e0) at perf.c:401
> > > #10 0x000055555564db1f in run_argv (argcp=0x7fffffffe0dc, argv=0x7fffffffe0d0) at perf.c:445
> > > #11 0x000055555564de68 in main (argc=6, argv=0x7fffffffe2e0) at perf.c:553
> > >
> > > I seems bpf_prog_info is broken for some reason.
> > >
> > > (gdb) up
> > > #1 0x00005555558e9306 in synthesize_bpf_prog_name (buf=0x5555563305b0 "bpf_prog_", size=512, info=0x55555664e1d0, btf=0x55555637ad40,
> > > sub_id=0) at util/bpf-event.c:144
> > > 144 name_len += snprintf_hex(buf + name_len, size - name_len,
> > >
> > > (gdb) p *info
> > > $1 = {type = 68, id = 80, tag = "\\\000\000\000\214\000\000", jited_prog_len = 152, xlated_prog_len = 164,
> > > jited_prog_insns = 824633721012, xlated_prog_insns = 1185410973912, load_time = 1305670058276, created_by_uid = 352,
> > > nr_map_ids = 364, map_ids = 1975684956608, name = "\330\001\000\000\350\001\000\000$\002\000\0004\002\000", ifindex = 576,
> > > gpl_compatible = 0, netns_dev = 2697239462496, netns_ino = 2834678416000, nr_jited_ksyms = 756, nr_jited_func_lens = 768,
> > > jited_ksyms = 3418793968396, jited_func_lens = 3573412791092, btf_id = 844, func_info_rec_size = 880, func_info = 3934190044028,
> > > nr_func_info = 928, nr_line_info = 952, line_info = 4294967296988, jited_line_info = 4449586119680, nr_jited_line_info = 1060,
> > > line_info_rec_size = 1076, jited_line_info_rec_size = 1092, nr_prog_tags = 1108, prog_tags = 4861902980192,
> > > run_time_ns = 5085241279632, run_cnt = 5257039971512, recursion_misses = 5360119186644, verified_insns = 1264,
> > > attach_btf_obj_id = 1288, attach_btf_id = 1312}
> >
> > Thanks Namhyung!
> >
> > So it looks like my "fix" was breaking the tools/perf build feature
> > test for btf_dump_type_data_opts opts.emit_strings and that was
> > avoiding this code.
> >
> > Having terminal 1 run perf trace is going to be loading/unloading a
> > BPF program for system call augmentation. This must be creating the
> > race condition that is causing perf record to segv when it is
> > inspecting the bpf_prog_info.
> >
> > The cast in:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/bpf-event.c#n135
> > ```
> > static int synthesize_bpf_prog_name(char *buf, int size,
> > struct bpf_prog_info *info,
> > struct btf *btf,
> > u32 sub_id)
> > {
> > u8 (*prog_tags)[BPF_TAG_SIZE] = (void *)(uintptr_t)(info->prog_tags);
> > ```
> > looks concerning given the bad address comes from:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/bpf-event.c#n144
> > ```
> > name_len += snprintf_hex(buf + name_len, size - name_len,
> > prog_tags[sub_id], BPF_TAG_SIZE);
> > ```
> > Checking git blame this code has existed since 2019, commit
> > 7b612e291a5a ("perf tools: Synthesize PERF_RECORD_* for loaded BPF
> > programs"):
> > http://lkml.kernel.org/r/20190117161521.1341602-8-songliubraving@fb.com
> > it was refactored in 2019 to a single memory allocation commit
> > ("a742258af131 perf bpf: Synthesize bpf events with
> > bpf_program__get_prog_info_linear()")
> > http://lkml.kernel.org/r/20190312053051.2690567-5-songliubraving@fb.com
> >
> > There seems like a potential race here:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/bpf-utils.c#n123
> > ```
> > /* step 1: get array dimensions */
> > err = bpf_obj_get_info_by_fd(fd, &info, &info_len);
> > ```
> > and later:
> > https://web.git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/perf/util/bpf-utils.c#n185
> > ```
> > /* step 5: call syscall again to get required arrays */
> > err = bpf_obj_get_info_by_fd(fd, &info_linear->info, &info_len);
> > ```
> > There's a verification step that looks to cover issues with the race.
> > I thought making those warnings fatal may help, but no:
> > ```
> > --- a/tools/perf/util/bpf-utils.c
> > +++ b/tools/perf/util/bpf-utils.c
> > @@ -202,14 +202,20 @@ get_bpf_prog_info_linear(int fd, __u64 arrays)
> > v1 = bpf_prog_info_read_offset_u32(&info, desc->count_offset);
> > v2 = bpf_prog_info_read_offset_u32(&info_linear->info,
> > desc->count_offset);
> > - if (v1 != v2)
> > + if (v1 != v2) {
> > pr_warning("%s: mismatch in element count\n", __func__);
> > + free(info_linear);
> > + return ERR_PTR(-EFAULT);
> > + }
> >
> > v1 = bpf_prog_info_read_offset_u32(&info, desc->size_offset);
> > v2 = bpf_prog_info_read_offset_u32(&info_linear->info,
> > desc->size_offset);
> > - if (v1 != v2)
> > + if (v1 != v2) {
> > pr_warning("%s: mismatch in rec size\n", __func__);
> > + free(info_linear);
> > + return ERR_PTR(-EFAULT);
> > + }
> > }
> >
> > /* step 7: update info_len and data_len */
> > ```
> >
> > Fwiw, the address of "data=0x40" in the stack trace makes it looks
> > like an offset has been applied to NULL. 0x40 is 64 which corresponds
> > with "name" info a bpf_prog_info by way of pahole:
> > ```
> > struct bpf_prog_info {
> > __u32 type; /* 0 4 */
> > /* --- cacheline 1 boundary (64 bytes) --- */
> > ...
> > char name[16]; /* 64 16 */
> > ```
> >
> > I feel we're relatively close to discovering a proper fix for the
> > issue, if others could lend a hand as I'm not overly familiar with the
> > BPF code. I'm wondering if the second bpf_obj_get_info_by_fd could be
> > filling in offsets relative to NULL rather than returning an error,
> > but this would be (I believe) a kernel issue :-(
>
> It's a use-after-free perf bug:
> https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/bpf-event.c?h=perf-tools-next#n660
> ```
> /* save bpf_prog_info to env */
> info_node = malloc(sizeof(struct bpf_prog_info_node));
> if (!info_node) {
> err = -1;
> goto out;
> }
>
> info_node->info_linear = info_linear;
> info_node->metadata = NULL;
> if (!perf_env__insert_bpf_prog_info(env, info_node)) {
> free(info_linear);
> free(info_node);
> }
> info_linear = NULL;
> ...
> /* Synthesize PERF_RECORD_BPF_METADATA */
> metadata = bpf_metadata_create(info);
> ```
> The insert can fail due to a duplicated BPF program (the sideband data
> thread is racing with the synthesizing thread in perf) leading to
> info_linear to be freed which the info points into. The later use of
> the freed memory leads to the crash.
>
> I hope that's all it is and I'll work on a fix.
That was all. Fixes sent in:
https://lore.kernel.org/all/20250902181713.309797-1-irogers@google.com/
Thanks,
Ian
Powered by blists - more mailing lists