Message-ID: <9f1182a6-fac7-41b0-b6db-24ff64afa8b2@linaro.org>
Date: Tue, 13 Jan 2026 12:03:51 +0000
From: James Clark <james.clark@...aro.org>
To: Ian Rogers <irogers@...gle.com>
Cc: Tony Jones <tonyj@...e.de>, Peter Zijlstra <peterz@...radead.org>,
 Ingo Molnar <mingo@...hat.com>, Arnaldo Carvalho de Melo <acme@...nel.org>,
 Namhyung Kim <namhyung@...nel.org>,
 Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
 Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
 Howard Chu <howardchu95@...il.com>,
 Stephen Brennan <stephen.s.brennan@...cle.com>,
 linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org
Subject: Re: [PATCH v3 0/7] perf: Add a libdw addr2line implementation



On 12/01/2026 6:29 pm, Ian Rogers wrote:
> On Mon, Jan 12, 2026 at 6:49 AM Ian Rogers <irogers@...gle.com> wrote:
>>
>> On Mon, Jan 12, 2026 at 3:18 AM James Clark <james.clark@...aro.org> wrote:
>>>
>>> On 11/01/2026 4:13 am, Ian Rogers wrote:
>>>> addr2line is a performance bottleneck in perf, add a libdw based
>>>> implementation that avoids forking addr2line and caches the decoded
>>>> debug information.
>>>>
>>>> Allow the addr2line implementation to be picked via the configuration
>>>> file or --addr2line-style with `perf report`.
>>>>
>>>> Test/fix that inline callchains are properly displayed by perf script.
>>>>
>>>> An example:
>>>> ```
>>>> $ perf record --call-graph dwarf -e cycles:u -- perf test -w inlineloop 1
>>>> [ perf record: Woken up 132 times to write data ]
>>>> [ perf record: Captured and wrote 32.814 MB perf.data (4074 samples) ]
>>>> $ perf script --fields +srcline
>>>> ...
>>>> perf-inlineloop 1814670 293100.228871:     640004 cpu_core/cycles/u:
>>>>               55a11d6e61ee leaf+0x2e
>>>>     inlineloop.c:21 (inlined)
>>>>               55a11d6e61ee middle+0x2e
>>>>     inlineloop.c:27 (inlined)
>>>>               55a11d6e61ee parent+0x2e (perf)
>>>>     inlineloop.c:32
>>>>               55a11d6e629b inlineloop+0x8b (perf)
>>>>     inlineloop.c:47
>>>>               55a11d69a3bc run_workload+0x5a (perf)
>>>>     builtin-test.c:715
>>>>               55a11d69aa9f cmd_test+0x417 (perf)
>>>>     builtin-test.c:825
>>>>               55a11d6155f5 run_builtin+0xd4 (perf)
>>>>     perf.c:349
>>>>               55a11d61588d handle_internal_command+0xdd (perf)
>>>>     perf.c:401
>>>>               55a11d6159e6 run_argv+0x35 (perf)
>>>>     perf.c:445
>>>>               55a11d615d2f main+0x2cb (perf)
>>>>     perf.c:553
>>>>               7fae3d233ca7 __libc_start_call_main+0x77 (libc.so.6)
>>>>     libc_start_call_main.h:58
>>>>               7fae3d233d64 __libc_start_main_impl+0x84
>>>>     libc-start.c:360 (inlined)
>>>>               55a11d565f80 _start+0x20 (perf)
>>>>     ??:0
>>>> ...
>>>> ```
>>>>
>>>> v3: Make the caller inline file and line number accurate in the libdw
>>>>       addr2line, rather than using the function's declared location.
>>>>       Fix reference counts in unwind-libdw. Add fixes tag for srcline
>>>>       inline printing.
>>>>
>>>> v2: Fix bias issue with libdwfl functions. Use cu_walk_functions_at
>>>>       from perf's dwarf-aux to fully walk inline functions. Add testing
>>>>       that inlined functions are shown in the perf script srcline
>>>>       callchain information. Add configurability as to which addr2line
>>>>       style to use.
>>>>       https://lore.kernel.org/lkml/20260110082647.1487574-1-irogers@google.com/
>>>>
>>>> v1: https://lore.kernel.org/lkml/20251122093934.94971-1-irogers@google.com/
>>>>
>>>> Ian Rogers (7):
>>>>     perf unwind-libdw: Fix invalid reference counts
>>>>     perf addr2line: Add a libdw implementation
>>>>     perf addr2line.c: Rename a2l_style to cmd_a2l_style
>>>>     perf srcline: Add configuration support for the addr2line style
>>>>     perf callchain: Fix srcline printing with inlines
>>>>     perf test workload: Add inlineloop test workload
>>>>     perf test: Test addr2line unwinding works with inline functions
>>>>
>>>>    tools/perf/builtin-report.c                 |  10 ++
>>>>    tools/perf/tests/builtin-test.c             |   1 +
>>>>    tools/perf/tests/shell/addr2line_inlines.sh |  47 ++++++
>>>>    tools/perf/tests/tests.h                    |   1 +
>>>>    tools/perf/tests/workloads/Build            |   2 +
>>>>    tools/perf/tests/workloads/inlineloop.c     |  52 +++++++
>>>>    tools/perf/util/Build                       |   1 +
>>>>    tools/perf/util/addr2line.c                 |  20 +--
>>>>    tools/perf/util/config.c                    |   4 +
>>>>    tools/perf/util/dso.c                       |   2 +
>>>>    tools/perf/util/dso.h                       |  11 ++
>>>>    tools/perf/util/evsel_fprintf.c             |   8 +-
>>>>    tools/perf/util/libdw.c                     | 153 ++++++++++++++++++++
>>>>    tools/perf/util/libdw.h                     |  60 ++++++++
>>>>    tools/perf/util/srcline.c                   | 116 ++++++++++++++-
>>>>    tools/perf/util/srcline.h                   |   3 +
>>>>    tools/perf/util/symbol_conf.h               |  10 ++
>>>>    tools/perf/util/unwind-libdw.c              |   7 +-
>>>>    18 files changed, 486 insertions(+), 22 deletions(-)
>>>>    create mode 100755 tools/perf/tests/shell/addr2line_inlines.sh
>>>>    create mode 100644 tools/perf/tests/workloads/inlineloop.c
>>>>    create mode 100644 tools/perf/util/libdw.c
>>>>    create mode 100644 tools/perf/util/libdw.h
>>>>
>>>
>>> I don't see the differences to the other addr2line implementations
>>> anymore, but only because it falls through to the old ones when libdw
>>> fails now.
>>>
>>> For example when building Perf with LLVM it can't get the line in the
>>> inlineloop workload, and there's still a few things in libc and other
>>> system libraries it fails on.
>>
>> Hmm.. I wonder what the issue is. I was looking at the dwarf output
>> from my gcc builds with llvm-dwarfdump. I wonder if LLVM builds are

I see some issues in libc on Ubuntu though, which I assume is compiled 
with GCC, although there's no .comment section in it so I can't be sure. 
So it's not exclusively LLVM but it does seem like LLVM builds cause a 
lot more failures.

>> doing something to confuse libdw? I'll try to investigate. There are
>> quite a few levels of libdw: there's the raw libdw, libdwfl (frontend
>> to libdw) that does the parsing and tries to give things like nested
>> debug scopes (libdwfl is the one needing addresses with a module bias
>> rather than raw file offsets), and then there is the dwarf-aux.c that
>> is in perf and is used by things like probe finding (I believe this
>> doesn't need biased addresses). Anyway, with the biases there are
>> things I can screw up (like in the v1 patch) but maybe the LLVM issue
>> is just a libdw and dwarf-5 kind of thing. Maybe it is ARM specific
>> :-/

Actually I get the same behavior on Arm and x86.

> 
> Testing with clang/llvm on x86-64 (dwarf5):
> ```
> $ make -C tools/perf O=/tmp/perf DEBUG=1 CC=clang CXX=clang++
> HOSTCC=clang clean all
> ...
> $ llvm-dwarfdump /tmp/perf/perf
> ...
> 0x0014f852: Compile Unit: length = 0x00000294, format = DWARF32,
> version = 0x0005, unit_type = DW_UT_compile,
> abbr_offset = 0x1879a, addr_size = 0x08 (next unit at 0x0014faea)
> 
> 0x0014f85e: DW_TAG_compile_unit
>               DW_AT_producer    ("Debian clang version 19.1.7 (3+build5)")
>               DW_AT_language    (DW_LANG_C11)
>               DW_AT_name        ("tests/workloads/inlineloop.c")
>               DW_AT_str_offsets_base    (0x0004a550)
>               DW_AT_stmt_list   (0x0008c3f2)
>               DW_AT_comp_dir    ("linux/tools/perf")
>               DW_AT_low_pc      (0x00000000001e61c0)
>               DW_AT_high_pc     (0x00000000001e62e9)
>               DW_AT_addr_base   (0x00022248)
>               DW_AT_loclists_base       (0x0000018a)
> ...
> $  sudo /tmp/perf/perf record --call-graph dwarf -e cycles:u --
> /tmp/perf/perf test -w inlineloop 1
> ...
> $  sudo /tmp/perf/perf script --fields +srcline
> ...
> perf-inlineloop 2284167 423038.015394:     569917 cpu_core/cycles/u:
>             56390020d2c6 leaf+0x26
>   inlineloop.c:21 (inlined)
>             56390020d2c6 middle+0x26
>   inlineloop.c:27 (inlined)
>             56390020d2c6 parent+0x26 (/tmp/perf/perf)
> ...
> ```
> I ran inside of gdb and confirmed that the libdw code is creating the
> inlined information (breakpoint on libdw_a2l_cb, etc.). So I'm not
> able to reproduce the LLVM issue for now on x86-64.
> 
> Thanks,
> Ian
> 

If I set this in ~/.perfconfig so the fallback is disabled:

  [addr2line]
	style = libdw

Then:

  $ make LLVM=1 -C tools/perf DEBUG=1 clean all
  $ perf record --delay 1000 -- perf test -w inlineloop 2
  $ perf script --fields ip,srcline
      6012b5957b70
   perf[1f7b70]
      6012b5957b70
   perf[1f7b70]
   ...


x86:

  $ clang -v
  Ubuntu clang version 15.0.7

Arm:

  $ clang -v
  Ubuntu clang version 18.1.8 (11~20.04.2)
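
As a side note on the libdw-only x86 output above, here is a minimal sketch of how the two printed values relate. It assumes that the bracketed value in `perf[1f7b70]` is the address relative to the perf binary (my reading of the fallback format, not something confirmed in this thread) and that pages are 4 KiB. The function name `load_base` is made up for illustration.

```python
# Sketch only: relate the sampled ip to the DSO-relative address in the
# fallback srcline. Assumes perf[<hex>] holds the binary-relative address.

def load_base(sample_ip: int, dso_addr: int) -> int:
    """Return the base at which the binary would have been mapped."""
    return sample_ip - dso_addr

# Values copied from the x86 output above.
ip = 0x6012B5957B70      # ip column
rel = 0x1F7B70           # value inside perf[...]

base = load_base(ip, rel)
print(hex(base))

# A mapping base should be page aligned (4 KiB pages assumed here).
print(base % 0x1000 == 0)
```

If the assumption holds, the difference is the per-module offset that libdwfl calls a bias, which is the value Ian mentions having to get right.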

Removing that ~/.perfconfig setting, which re-enables the LLVM fallback, works:

(x86)
$ perf script --fields ip,srcline
      6012b5957b70
   inlineloop.c:20
      6012b5957b70
   inlineloop.c:20

Interestingly, on Arm this results in zeros for line numbers. This is a 
completely different issue though which I didn't notice before because I 
built with GCC. It falls all the way back to A2L_STYLE_CMD:

(Arm)
$ perf script --fields ip,srcline
      aaaad0a7828c
   inlineloop.c:0
      aaaad0a7828c
   inlineloop.c:0

$ addr2line -e `which perf` -a -i -f aaaad0a7828c
0x0000aaaad0a7828c
??
??:0
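
To compare builds without eyeballing the output, something like the following sketch could tally the srcline column of `perf script --fields ip,srcline`. The three categories and the function names are my own invention, and the patterns are based only on the output shapes shown in this mail (`perf[1f7b70]`, `inlineloop.c:20`, `inlineloop.c:0`, `??:0`), so other formats would land in "other".

```python
import re
from collections import Counter

# Patterns derived from the outputs pasted in this mail (an assumption,
# not an exhaustive description of what perf script can print).
IP_LINE = re.compile(r"^[0-9a-f]+$")                    # e.g. 6012b5957b70
FALLBACK = re.compile(r"^\S+\[[0-9a-f]+\]$")            # e.g. perf[1f7b70]
FILE_LINE = re.compile(r"^(?P<file>.+):(?P<line>\d+)(?: \(inlined\))?$")

def classify(srcline: str) -> str:
    """Classify one srcline string as resolved, no-line, unresolved or other."""
    s = srcline.strip()
    if FALLBACK.match(s):
        return "unresolved"          # only dso[addr] was available
    m = FILE_LINE.match(s)
    if m:
        if m.group("file") == "??" or int(m.group("line")) == 0:
            return "no-line"         # file known (or not) but line is 0
        return "resolved"
    return "other"

def summarize(text: str) -> Counter:
    """Count srcline categories, skipping blank lines and bare ip lines."""
    counts = Counter()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or IP_LINE.match(line):
            continue
        counts[classify(line)] += 1
    return counts

sample = """\
      6012b5957b70
   perf[1f7b70]
      6012b5957b70
   inlineloop.c:20
      aaaad0a7828c
   inlineloop.c:0
"""
print(dict(summarize(sample)))
```

Running it over the libdw-only, fallback-enabled and Arm outputs would give one count per category for each case.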

Probably shouldn't get sidetracked by that here though. It's at least
working when compiled with GCC, and neither LLVM nor libdw works, so it's
no worse.

>>> But I think it's fine because it doesn't give the wrong line anymore, it
>>> just falls through to another working addr2line implementation.
>>
>> Just to confirm that with gcc builds it isn't failing now? ie it isn't
>> just an addr2line implementation that falls through all the time? I
>> was seeing things working/testing on x86 with gcc.
>>

No, the GCC Perf build always works with libdw as far as I can see. Just 
the occasional fall through to LLVM with some libc addresses.
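
For anyone following along, the fall-through behaviour I'm describing can be pictured with this rough sketch. It is not the perf code. The resolver functions are hypothetical stand-ins, and the order (libdw, then LLVM, then the addr2line command) is only what the results in this mail suggest.

```python
from typing import Callable, Optional, Sequence, Tuple

# A resolver takes an address and returns "file:line", or None on failure.
Resolver = Callable[[int], Optional[str]]

def resolve_srcline(addr: int,
                    resolvers: Sequence[Tuple[str, Resolver]]
                    ) -> Optional[Tuple[str, str]]:
    """Try each resolver in order and return (name, srcline) of the first hit."""
    for name, resolver in resolvers:
        result = resolver(addr)
        if result is not None:
            return name, result
    return None

# Hypothetical stand-ins that mimic the x86 LLVM-build result above.
def fake_libdw(addr: int) -> Optional[str]:
    return None                       # libdw finds nothing for this build

def fake_llvm(addr: int) -> Optional[str]:
    return "inlineloop.c:20" if addr == 0x1F7B70 else None

def fake_cmd(addr: int) -> Optional[str]:
    return "inlineloop.c:0"           # last resort, may lack a line number

chain = [("libdw", fake_libdw), ("llvm", fake_llvm), ("cmd", fake_cmd)]

print(resolve_srcline(0x1F7B70, chain))       # full chain
print(resolve_srcline(0x1F7B70, chain[:1]))   # style pinned to libdw only
```

Pinning the style in ~/.perfconfig corresponds to the one-element chain, which is why the libdw failure only becomes visible with that setting.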

>>> Reviewed-by: James Clark <james.clark@...aro.org>
>>
>> Thanks,
>> Ian

