lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [day] [month] [year] [list]
Message-ID: <aPRbfdU92XRLR-2N@google.com>
Date: Sun, 19 Oct 2025 12:31:09 +0900
From: Namhyung Kim <namhyung@...nel.org>
To: "Li, Tianyou" <tianyou.li@...el.com>
Cc: James Clark <james.clark@...aro.org>,
	Peter Zijlstra <peterz@...radead.org>,
	Ingo Molnar <mingo@...hat.com>,
	Arnaldo Carvalho de Melo <acme@...nel.org>,
	Mark Rutland <mark.rutland@....com>,
	Alexander Shishkin <alexander.shishkin@...ux.intel.com>,
	Jiri Olsa <jolsa@...nel.org>, Ian Rogers <irogers@...gle.com>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Kan Liang <kan.liang@...ux.intel.com>,
	Ravi Bangoria <ravi.bangoria@....com>, wangyang.guo@...el.com,
	pan.deng@...el.com, zhiguo.zhou@...el.com, jiebin.sun@...el.com,
	thomas.falcon@...el.com, dapeng1.mi@...el.com,
	linux-perf-users@...r.kernel.org, linux-kernel@...r.kernel.org
Subject: Re: [PATCH v2] perf tools annotate: fix a crash when annotate the
 same symbol with 's' and 'T'

Hello,

On Fri, Oct 17, 2025 at 12:04:42AM +0800, Li, Tianyou wrote:
> 
> On 10/16/2025 11:18 PM, James Clark wrote:
> > 
> > 
> > On 16/10/2025 4:04 pm, Li, Tianyou wrote:
> > > 
> > > On 10/16/2025 9:06 PM, James Clark wrote:
> > > > 
> > > > 
> > > > On 16/10/2025 4:36 am, Li, Tianyou wrote:
> > > > > Hi James,
> > > > > 
> > > > > Thanks for your time to review. Please see my comments inlined.
> > > > > 
> > > > > Regards,
> > > > > 
> > > > > Tianyou
> > > > > 
> > > > > On 10/16/2025 1:30 AM, James Clark wrote:
> > > > > > 
> > > > > > 
> > > > > > On 15/10/2025 6:20 pm, Tianyou Li wrote:
> > > > > > > When perf report with annotation for a symbol, press
> > > > > > > 's' and 'T', then exit
> > > > > > > the annotate browser. Once annotate the same symbol,
> > > > > > > the annotate browser
> > > > > > > will crash.
> > > > > > > 
> > > > > > > The browser.arch was required to be correctly updated when data type
> > > > > > > feature was enabled by 'T'. Usually it was
> > > > > > > initialized by symbol__annotate2
> > > > > > > function. If a symbol has already been correctly
> > > > > > > annotated at the first
> > > > > > > time, it should not call the symbol__annotate2
> > > > > > > function again, thus the
> > > > > > > browser.arch will not get initialized. Then at the
> > > > > > > second time to show the
> > > > > > > annotate browser, the data type needs to be
> > > > > > > displayed but the browser.arch
> > > > > > > is empty.
> > > > > > > 
> > > > > > > Stack trace as below:
> > > > > > > 
> > > > > > > Perf: Segmentation fault
> > > > > > > -------- backtrace --------
> > > > > > >      #0 0x55d365 in ui__signal_backtrace setup.c:0
> > > > > > >      #1 0x7f5ff1a3e930 in __restore_rt libc.so.6[3e930]
> > > > > > >      #2 0x570f08 in arch__is perf[570f08]
> > > > > > >      #3 0x562186 in annotate_get_insn_location perf[562186]
> > > > > > >      #4 0x562626 in __hist_entry__get_data_type annotate.c:0
> > > > > > >      #5 0x56476d in annotation_line__write perf[56476d]
> > > > > > >      #6 0x54e2db in annotate_browser__write annotate.c:0
> > > > > > >      #7 0x54d061 in ui_browser__list_head_refresh perf[54d061]
> > > > > > >      #8 0x54dc9e in annotate_browser__refresh annotate.c:0
> > > > > > >      #9 0x54c03d in __ui_browser__refresh browser.c:0
> > > > > > >      #10 0x54ccf8 in ui_browser__run perf[54ccf8]
> > > > > > >      #11 0x54eb92 in __hist_entry__tui_annotate perf[54eb92]
> > > > > > >      #12 0x552293 in do_annotate hists.c:0
> > > > > > >      #13 0x55941c in evsel__hists_browse hists.c:0
> > > > > > >      #14 0x55b00f in evlist__tui_browse_hists perf[55b00f]
> > > > > > >      #15 0x42ff02 in cmd_report perf[42ff02]
> > > > > > >      #16 0x494008 in run_builtin perf.c:0
> > > > > > >      #17 0x494305 in handle_internal_command perf.c:0
> > > > > > >      #18 0x410547 in main perf[410547]
> > > > > > >      #19 0x7f5ff1a295d0 in __libc_start_call_main libc.so.6[295d0]
> > > > > > >      #20 0x7f5ff1a29680 in
> > > > > > > __libc_start_main@@GLIBC_2.34 libc.so.6[29680]
> > > > > > >      #21 0x410b75 in _start perf[410b75]
> > > > > > > 
> > > > > > > Fixes: 1d4374afd000 ("perf annotate: Add 'T' hot key
> > > > > > > to toggle data type display")
> > > > > > > Reviewed-by: James Clark <james.clark@...aro.org>
> > > > > > > Signed-off-by: Tianyou Li <tianyou.li@...el.com>
> > > > > > > ---
> > > > > > >   tools/perf/ui/browsers/annotate.c | 3 +++
> > > > > > >   tools/perf/util/annotate.c        | 2 +-
> > > > > > >   tools/perf/util/annotate.h        | 2 ++
> > > > > > >   3 files changed, 6 insertions(+), 1 deletion(-)
> > > > > > > 
> > > > > > > diff --git a/tools/perf/ui/browsers/annotate.c
> > > > > > > b/tools/perf/ui/ browsers/annotate.c
> > > > > > > index 8fe699f98542..3b27ef1e8490 100644
> > > > > > > --- a/tools/perf/ui/browsers/annotate.c
> > > > > > > +++ b/tools/perf/ui/browsers/annotate.c
> > > > > > > @@ -1161,6 +1161,9 @@ int
> > > > > > > __hist_entry__tui_annotate(struct hist_entry *he,
> > > > > > > struct map_symbol *ms,
> > > > > > >               if (!annotation__has_source(notes))
> > > > > > >                   ui__warning("Annotation has no source code.");
> > > > > > >           }
> > > > > > > +    } else if (evsel__get_arch(evsel, &browser.arch)) {
> > > > > > > +        ui__error("Couldn't get architecture for
> > > > > > > event '%s'", evsel- >name);
> > > > > > > +        return -1;
> > > > > > >       }
> > > > > > 
> > > > > > symbol_annotate() only fails for negative return values
> > > > > > of evsel__get_arch(), but evsel__get_arch() has at least
> > > > > > two positive error return values.
> > > > > > 
> > > > > > If symbol_annotate() is wrong and it should be != 0 like
> > > > > > you have, then maybe symbol_annotate() should be fixed
> > > > > > in another commit in the same patchset as this one.
> > > > > > Otherwise you have two calls to the same thing right
> > > > > > next to each other that handle errors differently.
> > > > > 
> > > > > 
> > > > > Thanks James. I will give a try on handling the error
> > > > > message with symbol__strerror_disassemble. I am conservative
> > > > > to change the code in symbol_annotate, agreed it should be
> > > > > considered in another patch. Would like to focus this
> > > > > particular issue and get it fixed properly. Thanks.
> > > > > 
> > > > > 
> > > > 
> > > > Looks like there was a misunderstanding. I'm not saying that the
> > > > error is _reported_ differently, it's that the condition that
> > > > triggers the error is different.
> > > > 
> > > > symbol__annotate():
> > > > 
> > > >   err = evsel__get_arch(evsel, &arch);
> > > >   if (err < 0)
> > > >       return err;
> > > > 
> > > > You added:
> > > > 
> > > >   if (evsel__get_arch(evsel, &browser.arch))
> > > >      ...
> > > > 
> > > > evsel__get_arch() returns positive error values (and maybe also
> > > > negative?), so "< 0" behaves differently to "!= 0".
> > > > 
> > > > You either have to assume that "< 0" is correct and not change
> > > > it, but then you have to also check the return value in the same
> > > > way. Or if by doing "!= 0" you're implying that
> > > > symbol__annotate() is wrong to do "< 0", then you should fix it
> > > > now to not leave __hist_entry__tui_annotate() doing the same
> > > > thing two different ways at different times.
> > > > 
> > > Thanks James. I looked at the code of symbol__annotate, and noticed
> > > the if (err<0) statement. I did not mean to change the code in
> > > symbol__annotate because I did not understand why it handled the
> > > error code that way. The positive return value of evsel__get_arch
> > > indicates some error happens, eg in arm__annotate_init, so I use the
> > > symbol__strerror_disassemble function to handle both positive and
> > > negative error code.
> > > 
> > > I do agree we should check the error code of evsel__get_arch, but I
> > > am hesitate to touch the code which I am not sure the consequences.
> > > I agree it may deserve another patch but not in this patchset if we
> > > have clear answers on why "<0" is not correct, or we have a case to
> > > break the current code as a evidence. Thanks.
> > > 
> > > 
> > > Regards,
> > > 
> > > Tianyou
> > > 
> > 
> > It may take a little bit of effort to follow the code and look at the
> > git blame to see what happened, but it's really not going to be that
> > hard.
> 
> Truly appreciated for your instant response, and the suggestions about
> 'Fixes' tag, return value handling etc. I do check the git history about the
> code "<0", I still did not quite understand the reason of handling it in
> that way.

Looks like I just overlooked the error handling when I factored out the
function.  Please feel free to update symbol__annotate() to check != 0.

> 
> 
> > 
> > You're basically suggesting to add code that (when expanded) does this:
> > 
> >   if (first_run) {
> >      if (do_important_thing() < 0)
> >         return err;
> >   } else { // second run
> >      if (do_important_thing() != 0)
> >         return err;
> >   }
> > 
> > It's not going to help anyone who looks at it in the future. It's going
> > to make future refactors of evsel__get_arch() more difficult, and
> > without knowing why it's like that, it's possibly introducing another
> > bug.
> > 
> 
> I am suggesting to focus on the 'else' part. If that part of code is
> correct, then we might need to consider another patch for the "<0" code. I
> am eager for the answer as well.
> 
> 
> > It surely has to be consistent otherwise it doesn't make sense. And if
> > you sent a patch that did "< 0" I would still say "but it can return
> > positive errors, so the new code isn't right".
> > I did suggest in the beginning to not check the error at all and add a
> > comment saying it must succeed at that point because it's already done
> > once before, but that's not very defensive and it doesn't fix the other
> > possible bug.
> > 
> 
> Yes. I am not so sure 'must succeed' could be a right assumption, or for
> safety it's better to check the error code.

Agreed.  Anyway I can confirm that this patch fixed the crash.

Tested-by: Namhyung Kim <namhyung@...nel.org>

Thanks,
Namhyung

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ