[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <CAP-5=fWLgaWsv82dcPajVk=UmBbmwyEd7OVp6psZQ4TiXh-Meg@mail.gmail.com>
Date: Mon, 17 Nov 2025 08:58:29 -0800
From: Ian Rogers <irogers@...gle.com>
To: Fangrui Song <maskray@...rceware.org>
Cc: Namhyung Kim <namhyung@...nel.org>, Arnaldo Carvalho de Melo <acme@...nel.org>,
James Clark <james.clark@...aro.org>, Jiri Olsa <jolsa@...nel.org>,
Adrian Hunter <adrian.hunter@...el.com>, Peter Zijlstra <peterz@...radead.org>,
Ingo Molnar <mingo@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
linux-perf-users@...r.kernel.org, Eric Biggers <ebiggers@...nel.org>,
Pablo Galindo <pablogsal@...il.com>
Subject: Re: [PATCH 1/2] perf jitdump: Add load_addr to build-ID generation
On Sat, Nov 15, 2025 at 11:21 PM Fangrui Song <maskray@...rceware.org> wrote:
>
> On Fri, Nov 14, 2025 at 11:33 AM Ian Rogers <irogers@...gle.com> wrote:
> >
> > On Fri, Nov 14, 2025 at 10:57 AM Namhyung Kim <namhyung@...nel.org> wrote:
> > >
> > > On Fri, Nov 14, 2025 at 09:33:29AM -0800, Ian Rogers wrote:
> > > > On Fri, Nov 14, 2025 at 1:29 AM Namhyung Kim <namhyung@...nel.org> wrote:
> > > > >
> > > > > It was reported that python backtrace with JIT dump was broken after the
> > > > > change to built-in SHA-1 implementation. It seems python generates the
> > > > > same JIT code for each function. They will become separate DSOs but the
> > > > > contents are the same. Only difference is in the symbol name.
> > > > >
> > > > > But this caused a problem that every JIT'ed DSOs will have the same
> > > > > build-ID which makes perf confused. And it resulted in no python
> > > > > symbols (from JIT) in the output.
> > > >
> > > > The lookup of a DSO involves the build ID and the filename. I'm
> > > > confused as to why things weren't deduplicated and why no symbols
> > > > rather than repeatedly the same symbol?
> > >
> > > I don't know, but that's the symptom in the original bug report in the
> > > python github (see Links: below). I guess the behavior is
> > > non-deterministic.
> > >
> > > >
> > > > > Looking back at the original code before the conversion, it used the
> > > > > load_addr as well as the code section to distinguish each DSO. I think
> > > > > we should do the same or use symbol table as an additional input for
> > > > > SHA-1.
> > > >
> > > > Hmm.. the build ID for the contents of the code should be a constant.
> > > > As the build ID is a note for the entire ELF file then something is
> > > > wrong with the filename handling it seems.
> > >
> > > When it tries to load symbols from a DSO, it prefer reading from the
> > > build-ID cache than the file system since it trusts build-IDs more than
> > > the path name. See dso__load() and binary_type_symtab[].
> > >
> > > So having multiple DSO's with the same build-ID can be a problem if they
> > > are in the build-ID cache. Normally `perf inject -j` won't add the new
> > > JIT-ed DSOs to the build-ID cache but it's still possible.
> >
> > +Fangrui
> >
> > I'm surprised that build IDs don't include symbol names but:
> > ```
> > $ cat a.s
> > .text
> > .global main
> > .global foo
> > main:
> > foo:
> > ret
> > $ cat b.s
> > .text
> > .global main
> > .global bar
> > main:
> > bar:
> > ret
> > $ gcc -Wl,--build-id a.s -o a.out
> > $ gcc -Wl,--build-id b.s -o b.out
> > $ readelf -n a.out
> > ...
> > Build ID: 9dd0371b953db5d72929af5d98552e4ee1043616
> > ...
> > $ readelf -n b.out
> > ...
> > Build ID: 9dd0371b953db5d72929af5d98552e4ee1043616
> > ...
> > ```
> > so ugh. Perhaps we need to have jitdump make a single object file (and
> > so 1 build ID) but with multiple unique symbols.
> >
> > Thanks,
> > Ian
>
> Looks like a GNU ld issue. Other linkers should give different build IDs.
>
> .symtab/.strtab/.shstrtab are special sections and BFD seem to skip
> their content for the build ID.
> Raised https://sourceware.org/bugzilla/show_bug.cgi?id=33633 ("ld
> --build-id does not use symtab/strtab content")
Thanks Fangrui!
I repeated my test above and found for GNU ld and bfd the build IDs
didn't vary, but for gold and lld they do vary.
So it looks like in perf we need to update the sha1 calculation:
https://web.git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools-next.git/tree/tools/perf/util/genelf.c?h=perf-tools-next#n397
to include at least the symbol information.
I don't know if it is specified how the build ID should be calculated.
It seems like it should be a specified thing to stop tampering with
files, but I also see each linker produce a different value for my
test. I'm wondering if the sha1 calculation should really be including
other sections in the jitdump generated ELF files? For example, the
.eh_frame could be useful as that has unwinding implications.
Thanks,
Ian
> >
> >
> >
> > > Thanks,
> > > Namhyung
> > >
> > > >
> > > > > This patch is a quick-and-dirty fix just to add each byte of the
> > > > > load_addr to the first 8 bytes of SHA-1 result. Probably we need to add
> > > > > sha1_update() or similar to update the existing hash value and use it
> > > > > here. I'd like something that can be backported to the stable trees
> > > > > easily.
> > > > >
> > > > > Fixes: e3f612c1d8f3945b ("perf genelf: Remove libcrypto dependency and use built-in sha1()")
> > > > > Cc: Eric Biggers <ebiggers@...nel.org>
> > > > > Cc: Pablo Galindo <pablogsal@...il.com>
> > > > > Link: https://github.com/python/cpython/issues/139544
> > > > > Signed-off-by: Namhyung Kim <namhyung@...nel.org>
> > > > > ---
> > > > > tools/perf/util/genelf.c | 9 +++++++++
> > > > > 1 file changed, 9 insertions(+)
> > > > >
> > > > > diff --git a/tools/perf/util/genelf.c b/tools/perf/util/genelf.c
> > > > > index 591548b10e34ef6a..a412e6faf70e37f3 100644
> > > > > --- a/tools/perf/util/genelf.c
> > > > > +++ b/tools/perf/util/genelf.c
> > > > > @@ -395,6 +395,15 @@ jit_write_elf(int fd, uint64_t load_addr __maybe_unused, const char *sym,
> > > > > * build-id generation
> > > > > */
> > > > > sha1(code, csize, bnote.build_id);
> > > > > + /* FIXME: update the SHA-1 hash using additional contents */
> > > > > + bnote.build_id[0] += (load_addr >> 0) & 0xff;
> > > > > + bnote.build_id[1] += (load_addr >> 8) & 0xff;
> > > > > + bnote.build_id[2] += (load_addr >> 16) & 0xff;
> > > > > + bnote.build_id[3] += (load_addr >> 24) & 0xff;
> > > > > + bnote.build_id[4] += (load_addr >> 32) & 0xff;
> > > > > + bnote.build_id[5] += (load_addr >> 40) & 0xff;
> > > > > + bnote.build_id[6] += (load_addr >> 48) & 0xff;
> > > > > + bnote.build_id[7] += (load_addr >> 56) & 0xff;
> > > > > bnote.desc.namesz = sizeof(bnote.name); /* must include 0 termination */
> > > > > bnote.desc.descsz = sizeof(bnote.build_id);
> > > > > bnote.desc.type = NT_GNU_BUILD_ID;
> > > > > --
> > > > > 2.52.0.rc1.455.g30608eb744-goog
> > > > >
Powered by blists - more mailing lists