Message-ID: <CAM9d7cgBZVfur8S3QC2woUA2C6O3Dme0YHP8PbFcwc_o0k-dWg@mail.gmail.com>
Date: Wed, 22 May 2024 10:27:06 -0700
From: Namhyung Kim <namhyung@...nel.org>
To: "Steinar H. Gunderson" <sesse@...gle.com>
Cc: acme@...nel.org, linux-perf-users@...r.kernel.org,
linux-kernel@...r.kernel.org, irogers@...gle.com
Subject: Re: [PATCH v4 1/3] perf report: Support LLVM for addr2line()
Hello,
Thanks a lot for the patches!
On Mon, May 20, 2024 at 1:31 AM Steinar H. Gunderson <sesse@...gle.com> wrote:
>
> In addition to the existing support for libbfd and calling out to
> an external addr2line command, add support for using libllvm directly.
> This is both faster than libbfd, and can be enabled in distro builds
> (the LLVM license has an explicit provision for GPLv2 compatibility).
> Thus, it is set as the primary choice if available.
>
> As an example, running perf report on a medium-size profile with
> DWARF-based backtraces took 58 seconds with LLVM, 78 seconds with
> libbfd, 153 seconds with external llvm-addr2line, and I got tired
> and aborted the test after waiting for 55 minutes with external
> bfd addr2line (which is the default for perf as compiled by distributions
> today). Evidently, for this case, the bfd addr2line process needs
> 18 seconds (on a 5.2 GHz Zen 3) to load the .debug ELF in question,
> hits the 1-second timeout and gets killed during initialization,
> getting restarted anew every time. Having an in-process addr2line
> makes this much more robust.
>
> As future extensions, libllvm can be used in many other places where
> we currently use libbfd or other libraries:
>
> - Symbol enumeration (in particular, for PE binaries).
> - Demangling (including non-Itanium demangling, e.g. Microsoft
> or Rust).
> - Disassembling (perf annotate).
I think it should also be able to support other DWARF use cases,
like unwinding and type info?
>
> However, these are much less pressing; most people don't profile
> PE binaries, and perf has non-bfd paths for ELF. The same with
> demangling; the default __cxa_demangle path works fine for most
> users. Disassembling is coming in a later patch in the series;
> however do note that while bfd objdump can be slow on large binaries,
> it is possible to use --objdump=llvm-objdump to get the speed benefits.
IIRC, bfd objdump is sometimes faster than llvm-objdump,
especially when no line numbers are requested.
> (It appears LLVM-based demangling is very simple, should we want
> that.)
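For reference, the default demangling path mentioned above goes through the Itanium C++ ABI helper abi::__cxa_demangle() from <cxxabi.h>. The following is only a minimal standalone sketch of that call, not perf's actual code; the demangle() wrapper name and the fall-back-to-input behaviour are assumptions made for illustration.

```cpp
#include <cxxabi.h>   // abi::__cxa_demangle (Itanium C++ ABI)
#include <cstdlib>    // std::free
#include <memory>
#include <string>

// Hypothetical helper (not part of perf): returns the demangled form of an
// Itanium-mangled symbol, or the input unchanged if it cannot be demangled.
std::string demangle(const char *mangled)
{
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with
    // malloc(), so the caller has to release it with free().
    std::unique_ptr<char, decltype(&std::free)> out(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), &std::free);
    // status == 0 means success; any other value leaves the result null.
    if (status == 0 && out)
        return std::string(out.get());
    return std::string(mangled);
}
```

Non-Itanium schemes (Microsoft, Rust) are outside what this call handles, which is where an LLVM-based demangler would come in.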
>
> Tested with LLVM 14, 15, 16, 18 and 19. For some reason, LLVM 12 was not
> correctly detected using feature_check, and thus was not tested.
Anyway, nice work. Maybe we can implement other use cases
using LLVM and reduce the dependencies.
Thanks,
Namhyung
>
> Signed-off-by: Steinar H. Gunderson <sesse@...gle.com>