[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <CAP-5=fW01urmSotcEHauc3xVUPqDRhkBttCvpa=6FuJyRaxiEw@mail.gmail.com>
Date: Thu, 27 Nov 2025 05:19:53 -0800
From: Ian Rogers <irogers@...gle.com>
To: James Clark <james.clark@...aro.org>
Cc: Namhyung Kim <namhyung@...nel.org>, Tony Jones <tonyj@...e.de>,
Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>, Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
Jiri Olsa <jolsa@...nel.org>, Adrian Hunter <adrian.hunter@...el.com>,
Howard Chu <howardchu95@...il.com>, Stephen Brennan <stephen.s.brennan@...cle.com>,
linux-kernel@...r.kernel.org, linux-perf-users@...r.kernel.org
Subject: Re: [PATCH v1] perf addr2line: Add a libdw implementation
On Thu, Nov 27, 2025 at 4:16 AM James Clark <james.clark@...aro.org> wrote:
>
>
>
> On 27/11/2025 11:43 am, Ian Rogers wrote:
> > On Wed, Nov 26, 2025 at 10:27 AM Namhyung Kim <namhyung@...nel.org> wrote:
> >>
> >> On Sat, Nov 22, 2025 at 01:39:34AM -0800, Ian Rogers wrote:
> >>> Add an implementation of addr2line that uses libdw. Other addr2line
> >>> implementations or, in the case of forking addr2line, slow. Add an
> >>> implementation that caches the libdw information in the dso and uses
> >>> it to find the file and line number information.
> >
> > Thanks James and Namhyung for the reviews! I agree with James' comment
> > about a typo in the commit message.
> >
> >> My concern is the limit in the open file descriptors in case the data
> >> touched a lot of different libraries. The DSO code has some logic to
> >> deal with it but I'm not sure if we can share that since libdw seems to
> >> want to own the FD.
> >
> > The code opens the FD:
> >
> > + fd = open(dso_name, O_RDONLY);
> > + if (fd < 0)
> > + return 0;
> > +
> > + dwfl = dwfl_begin(&offline_callbacks);
> > + if (!dwfl) {
> > + close(fd);
> > + return 0;
> > + }
> >
> > It then uses the FD and closes it (the close is hidden in libdw itself):
> >
> > + /*
> > + * If the report is successful, the file descriptor fd
> > is consumed
> > + * and closed by the Dwfl. If not, it is not closed.
> > + */
> > + mod = dwfl_report_offline(dwfl, dso_name, dso_name, fd);
> >
> > So it is possible we exhaust all the file descriptors if there are
> > multiple concurrent calls to libdw__addr2line and every dso has
> > missing libdw dwfl data... but because the open/close are close
> > together and that 1 FD is say small to the FDs needed for the
> > cmd__addr2line, I don't think it is a problem we need to specifically
> > handle. Were the FD kept open until the dso was deleted, I'd agree
> > with you.
> >
> >> Also, have you checked if this generates the exactly same output with
> >> other implementations?
> >
> > So the code passes `perf test` and I was checking functionality with
> > perf annotate, top, etc. What I saw looked good, but it may not have
> > been exhaustive. I didn't specifically create a test that compares the
> > output of the different addr2line implementations. Such a test would
> > be possible, it's not something we've done elsewhere.
> >
> > Thanks,
> > Ian
> >
>
> I manually looked at a couple of line numbers and they looked
> reasonable. I think an automated test that compared dwarf decoders would
> be a bit of a nightmare because I'm sure there would always be subtle
> differences.
>
> Doing a manual side by side comparison of libdw__addr2line() and
> llvm__addr2line(), they seem to be quite different:
>
> $ perf record -- perf test -w leafloop
> $ perf script -F ip,srcline > libdw_addr2line.txt
> # Comment out libdw_addr2line() and rebuild
> $ perf script -F ip,srcline > llvm_addr2line.txt
> $ diff libdw_addr2line.txt llvm_addr2line.txt
>
> It gets all of the simple leafloop workload lines the same, but lots of
> the libc stuff is different.
>
> For example libdw gives cpu-features.c:350 where llvm gives strcmp.S. At
> least that one looks like inlining so llvm might be "better", but where
> it gives dl-cache.c:490 isntead of llvm's mmap64.c:47, it's hard to see
> how that can be inlining if they're in different compilation units.
>
> If the llvm addr2line implementation is also supposed to be slow, it
> just means we're trading speed with accuracy with this change. Hard to
> say what the default should be in that case.
Agreed. We could do some kind of option scheme like with the disassemblers:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/disasm.c?h=perf-tools-next#n1715
I'm not sure there are currently any distributions using the LLVM
options, this in part motivates:
https://lore.kernel.org/lkml/20251007163835.3152881-1-irogers@google.com/
but the addr2line APIs are a pain there - see the extra generated .so.
Because of that it could be reasonable to delete the LLVM addr2line
code and just focus on libdw. I kind of wish we could also delete the
precarious command forking versions too. And libbfd... Perhaps with
this code, those advocating libbfd have a fast enough alternative
without bringing in large dependencies like libLLVM.
Thanks,
Ian
> >> Thanks,
> >> Namhyung
> >>
> >>>
> >>> Signed-off-by: Ian Rogers <irogers@...gle.com>
> >>> ---
> >>> tools/perf/util/Build | 1 +
> >>> tools/perf/util/dso.c | 2 +
> >>> tools/perf/util/dso.h | 11 +++
> >>> tools/perf/util/libdw.c | 136 ++++++++++++++++++++++++++++++++++++++
> >>> tools/perf/util/libdw.h | 60 +++++++++++++++++
> >>> tools/perf/util/srcline.c | 5 ++
> >>> 6 files changed, 215 insertions(+)
> >>> create mode 100644 tools/perf/util/libdw.c
> >>> create mode 100644 tools/perf/util/libdw.h
> >>>
> >>> diff --git a/tools/perf/util/Build b/tools/perf/util/Build
> >>> index 1c2a43e1dc68..2bed6274e248 100644
> >>> --- a/tools/perf/util/Build
> >>> +++ b/tools/perf/util/Build
> >>> @@ -224,6 +224,7 @@ perf-util-$(CONFIG_LIBDW) += dwarf-regs-powerpc.o
> >>> perf-util-$(CONFIG_LIBDW) += dwarf-regs-x86.o
> >>> perf-util-$(CONFIG_LIBDW) += debuginfo.o
> >>> perf-util-$(CONFIG_LIBDW) += annotate-data.o
> >>> +perf-util-$(CONFIG_LIBDW) += libdw.o
> >>>
> >>> perf-util-$(CONFIG_LIBDW_DWARF_UNWIND) += unwind-libdw.o
> >>> perf-util-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind-local.o
> >>> diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
> >>> index 344e689567ee..06980844c014 100644
> >>> --- a/tools/perf/util/dso.c
> >>> +++ b/tools/perf/util/dso.c
> >>> @@ -32,6 +32,7 @@
> >>> #include "string2.h"
> >>> #include "vdso.h"
> >>> #include "annotate-data.h"
> >>> +#include "libdw.h"
> >>>
> >>> static const char * const debuglink_paths[] = {
> >>> "%.0s%s",
> >>> @@ -1605,6 +1606,7 @@ void dso__delete(struct dso *dso)
> >>> auxtrace_cache__free(RC_CHK_ACCESS(dso)->auxtrace_cache);
> >>> dso_cache__free(dso);
> >>> dso__free_a2l(dso);
> >>> + dso__free_a2l_libdw(dso);
> >>> dso__free_symsrc_filename(dso);
> >>> nsinfo__zput(RC_CHK_ACCESS(dso)->nsinfo);
> >>> mutex_destroy(dso__lock(dso));
> >>> diff --git a/tools/perf/util/dso.h b/tools/perf/util/dso.h
> >>> index f8ccb9816b89..4aee23775054 100644
> >>> --- a/tools/perf/util/dso.h
> >>> +++ b/tools/perf/util/dso.h
> >>> @@ -268,6 +268,7 @@ DECLARE_RC_STRUCT(dso) {
> >>> const char *short_name;
> >>> const char *long_name;
> >>> void *a2l;
> >>> + void *a2l_libdw;
> >>> char *symsrc_filename;
> >>> #if defined(__powerpc__)
> >>> void *dwfl; /* DWARF debug info */
> >>> @@ -334,6 +335,16 @@ static inline void dso__set_a2l(struct dso *dso, void *val)
> >>> RC_CHK_ACCESS(dso)->a2l = val;
> >>> }
> >>>
> >>> +static inline void *dso__a2l_libdw(const struct dso *dso)
> >>> +{
> >>> + return RC_CHK_ACCESS(dso)->a2l_libdw;
> >>> +}
> >>> +
> >>> +static inline void dso__set_a2l_libdw(struct dso *dso, void *val)
> >>> +{
> >>> + RC_CHK_ACCESS(dso)->a2l_libdw = val;
> >>> +}
> >>> +
> >>> static inline unsigned int dso__a2l_fails(const struct dso *dso)
> >>> {
> >>> return RC_CHK_ACCESS(dso)->a2l_fails;
> >>> diff --git a/tools/perf/util/libdw.c b/tools/perf/util/libdw.c
> >>> new file mode 100644
> >>> index 000000000000..c4331fa8e1a3
> >>> --- /dev/null
> >>> +++ b/tools/perf/util/libdw.c
> >>> @@ -0,0 +1,136 @@
> >>> +// SPDX-License-Identifier: GPL-2.0
> >>> +#include "dso.h"
> >>> +#include "libdw.h"
> >>> +#include "srcline.h"
> >>> +#include "symbol.h"
> >>> +#include "dwarf-aux.h"
> >>> +#include <fcntl.h>
> >>> +#include <unistd.h>
> >>> +#include <elfutils/libdwfl.h>
> >>> +
> >>> +void dso__free_a2l_libdw(struct dso *dso)
> >>> +{
> >>> + Dwfl *dwfl = dso__a2l_libdw(dso);
> >>> +
> >>> + if (dwfl) {
> >>> + dwfl_end(dwfl);
> >>> + dso__set_a2l_libdw(dso, NULL);
> >>> + }
> >>> +}
> >>> +
> >>> +int libdw__addr2line(const char *dso_name, u64 addr,
> >>> + char **file, unsigned int *line_nr,
> >>> + struct dso *dso, bool unwind_inlines,
> >>> + struct inline_node *node, struct symbol *sym)
> >>> +{
> >>> + static const Dwfl_Callbacks offline_callbacks = {
> >>> + .find_debuginfo = dwfl_standard_find_debuginfo,
> >>> + .section_address = dwfl_offline_section_address,
> >>> + .find_elf = dwfl_build_id_find_elf,
> >>> + };
> >>> + Dwfl *dwfl = dso__a2l_libdw(dso);
> >>> + Dwfl_Module *mod;
> >>> + Dwfl_Line *dwline;
> >>> + Dwarf_Addr bias;
> >>> + const char *src;
> >>> + int lineno;
> >>> +
> >>> + if (!dwfl) {
> >>> + /*
> >>> + * Initialize Dwfl session.
> >>> + * We need to open the DSO file to report it to libdw.
> >>> + */
> >>> + int fd;
> >>> +
> >>> + fd = open(dso_name, O_RDONLY);
> >>> + if (fd < 0)
> >>> + return 0;
> >>> +
> >>> + dwfl = dwfl_begin(&offline_callbacks);
> >>> + if (!dwfl) {
> >>> + close(fd);
> >>> + return 0;
> >>> + }
> >>> +
> >>> + /*
> >>> + * If the report is successful, the file descriptor fd is consumed
> >>> + * and closed by the Dwfl. If not, it is not closed.
> >>> + */
> >>> + mod = dwfl_report_offline(dwfl, dso_name, dso_name, fd);
> >>> + if (!mod) {
> >>> + dwfl_end(dwfl);
> >>> + close(fd);
> >>> + return 0;
> >>> + }
> >>> +
> >>> + dwfl_report_end(dwfl, /*removed=*/NULL, /*arg=*/NULL);
> >>> + dso__set_a2l_libdw(dso, dwfl);
> >>> + } else {
> >>> + /* Dwfl session already initialized, get module for address. */
> >>> + mod = dwfl_addrmodule(dwfl, addr);
> >>> + }
> >>> +
> >>> + if (!mod)
> >>> + return 0;
> >>> +
> >>> + /* Find source line information for the address. */
> >>> + dwline = dwfl_module_getsrc(mod, addr);
> >>> + if (!dwline)
> >>> + return 0;
> >>> +
> >>> + /* Get line information. */
> >>> + src = dwfl_lineinfo(dwline, &addr, &lineno, /*col=*/NULL, /*mtime=*/NULL, /*length=*/NULL);
> >>> +
> >>> + if (file)
> >>> + *file = src ? strdup(src) : NULL;
> >>> + if (line_nr)
> >>> + *line_nr = lineno;
> >>> +
> >>> + /* Optionally unwind inline function call chain. */
> >>> + if (unwind_inlines && node && src) {
> >>> + Dwarf_Die *cudie = dwfl_module_addrdie(mod, addr, &bias);
> >>> + Dwarf_Die *scopes = NULL;
> >>> + int nscopes;
> >>> +
> >>> + if (!cudie)
> >>> + return 1;
> >>> +
> >>> + nscopes = die_get_scopes(cudie, addr, &scopes);
> >>> + if (nscopes > 0) {
> >>> + int i;
> >>> + const char *call_file = src;
> >>> + unsigned int call_line = lineno;
> >>> +
> >>> + for (i = 0; i < nscopes; i++) {
> >>> + Dwarf_Die *die = &scopes[i];
> >>> + struct symbol *inline_sym;
> >>> + char *srcline = NULL;
> >>> + int tag = dwarf_tag(die);
> >>> +
> >>> + /* We are interested in inlined subroutines. */
> >>> + if (tag != DW_TAG_inlined_subroutine &&
> >>> + tag != DW_TAG_subprogram)
> >>> + continue;
> >>> +
> >>> + inline_sym = new_inline_sym(dso, sym, dwarf_diename(die));
> >>> +
> >>> + if (call_file)
> >>> + srcline = srcline_from_fileline(call_file, call_line);
> >>> +
> >>> + inline_list__append(inline_sym, srcline, node);
> >>> +
> >>> + /* Update call site for next level. */
> >>> + if (tag == DW_TAG_inlined_subroutine) {
> >>> + call_file = die_get_call_file(die);
> >>> + call_line = die_get_call_lineno(die);
> >>> + } else {
> >>> + /* Reached the root subprogram. */
> >>> + break;
> >>> + }
> >>> + }
> >>> + free(scopes);
> >>> + }
> >>> + }
> >>> +
> >>> + return 1;
> >>> +}
> >>> diff --git a/tools/perf/util/libdw.h b/tools/perf/util/libdw.h
> >>> new file mode 100644
> >>> index 000000000000..0f8d7b4a11a5
> >>> --- /dev/null
> >>> +++ b/tools/perf/util/libdw.h
> >>> @@ -0,0 +1,60 @@
> >>> +/* SPDX-License-Identifier: GPL-2.0 */
> >>> +#ifndef PERF_LIBDW_H
> >>> +#define PERF_LIBDW_H
> >>> +
> >>> +#include <linux/types.h>
> >>> +
> >>> +struct dso;
> >>> +struct inline_node;
> >>> +struct symbol;
> >>> +
> >>> +#ifdef HAVE_LIBDW_SUPPORT
> >>> +/*
> >>> + * libdw__addr2line - Convert address to source location using libdw
> >>> + * @dso_name: Name of the DSO
> >>> + * @addr: Address to resolve
> >>> + * @file: Pointer to return filename (caller must free)
> >>> + * @line_nr: Pointer to return line number
> >>> + * @dso: The dso struct
> >>> + * @unwind_inlines: Whether to unwind inline function calls
> >>> + * @node: Inline node list to append to
> >>> + * @sym: The symbol associated with the address
> >>> + *
> >>> + * This function initializes a Dwfl context for the DSO if not already present,
> >>> + * finds the source line information for the given address, and optionally
> >>> + * resolves inline function call chains.
> >>> + *
> >>> + * Returns 1 on success (found), 0 on failure (not found).
> >>> + */
> >>> +int libdw__addr2line(const char *dso_name, u64 addr, char **file,
> >>> + unsigned int *line_nr, struct dso *dso,
> >>> + bool unwind_inlines, struct inline_node *node,
> >>> + struct symbol *sym);
> >>> +
> >>> +/*
> >>> + * dso__free_a2l_libdw - Free libdw resources associated with the DSO
> >>> + * @dso: The dso to free resources for
> >>> + *
> >>> + * This function cleans up the Dwfl context used for addr2line lookups.
> >>> + */
> >>> +void dso__free_a2l_libdw(struct dso *dso);
> >>> +
> >>> +#else /* HAVE_LIBDW_SUPPORT */
> >>> +
> >>> +static inline int libdw__addr2line(const char *dso_name __maybe_unused,
> >>> + u64 addr __maybe_unused, char **file __maybe_unused,
> >>> + unsigned int *line_nr __maybe_unused,
> >>> + struct dso *dso __maybe_unused,
> >>> + bool unwind_inlines __maybe_unused,
> >>> + struct inline_node *node __maybe_unused,
> >>> + struct symbol *sym __maybe_unused)
> >>> +{
> >>> + return 0;
> >>> +}
> >>> +
> >>> +static inline void dso__free_a2l_libdw(struct dso *dso __maybe_unused)
> >>> +{
> >>> +}
> >>> +#endif /* HAVE_LIBDW_SUPPORT */
> >>> +
> >>> +#endif /* PERF_LIBDW_H */
> >>> diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
> >>> index 27c0966611ab..4b456c4d4138 100644
> >>> --- a/tools/perf/util/srcline.c
> >>> +++ b/tools/perf/util/srcline.c
> >>> @@ -6,6 +6,7 @@
> >>> #include "libbfd.h"
> >>> #include "llvm.h"
> >>> #include "symbol.h"
> >>> +#include "libdw.h"
> >>>
> >>> #include <inttypes.h>
> >>> #include <string.h>
> >>> @@ -120,6 +121,10 @@ static int addr2line(const char *dso_name, u64 addr, char **file, unsigned int *
> >>> {
> >>> int ret;
> >>>
> >>> + ret = libdw__addr2line(dso_name, addr, file, line_nr, dso, unwind_inlines, node, sym);
> >>> + if (ret > 0)
> >>> + return ret;
> >>> +
> >>> ret = llvm__addr2line(dso_name, addr, file, line_nr, dso, unwind_inlines, node, sym);
> >>> if (ret > 0)
> >>> return ret;
> >>> --
> >>> 2.52.0.rc2.455.g230fcf2819-goog
> >>>
>
Powered by blists - more mailing lists