[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Message-ID: <9dd7ecce-dd4f-47a1-a7ad-bb48da8c21f2@intel.com>
Date: Fri, 17 Oct 2025 00:04:42 +0800
From: "Li, Tianyou" <tianyou.li@...el.com>
To: James Clark <james.clark@...aro.org>
CC: Peter Zijlstra <peterz@...radead.org>, Ingo Molnar <mingo@...hat.com>,
Arnaldo Carvalho de Melo <acme@...nel.org>, Mark Rutland
<mark.rutland@....com>, Alexander Shishkin
<alexander.shishkin@...ux.intel.com>, Jiri Olsa <jolsa@...nel.org>, "Ian
Rogers" <irogers@...gle.com>, Adrian Hunter <adrian.hunter@...el.com>, "Kan
Liang" <kan.liang@...ux.intel.com>, Ravi Bangoria <ravi.bangoria@....com>,
<wangyang.guo@...el.com>, <pan.deng@...el.com>, <zhiguo.zhou@...el.com>,
<jiebin.sun@...el.com>, <thomas.falcon@...el.com>, <dapeng1.mi@...el.com>,
<linux-perf-users@...r.kernel.org>, <linux-kernel@...r.kernel.org>, "Namhyung
Kim" <namhyung@...nel.org>
Subject: Re: [PATCH v2] perf tools annotate: fix a crash when annotate the
same symbol with 's' and 'T'
On 10/16/2025 11:18 PM, James Clark wrote:
>
>
> On 16/10/2025 4:04 pm, Li, Tianyou wrote:
>>
>> On 10/16/2025 9:06 PM, James Clark wrote:
>>>
>>>
>>> On 16/10/2025 4:36 am, Li, Tianyou wrote:
>>>> Hi James,
>>>>
>>>> Thanks for your time to review. Please see my comments inlined.
>>>>
>>>> Regards,
>>>>
>>>> Tianyou
>>>>
>>>> On 10/16/2025 1:30 AM, James Clark wrote:
>>>>>
>>>>>
>>>>> On 15/10/2025 6:20 pm, Tianyou Li wrote:
>>>>>> When perf report with annotation for a symbol, press 's' and 'T',
>>>>>> then exit
>>>>>> the annotate browser. Once annotate the same symbol, the annotate
>>>>>> browser
>>>>>> will crash.
>>>>>>
>>>>>> The browser.arch was required to be correctly updated when data type
>>>>>> feature was enabled by 'T'. Usually it was initialized by
>>>>>> symbol__annotate2
>>>>>> function. If a symbol has already been correctly annotated at the
>>>>>> first
>>>>>> time, it should not call the symbol__annotate2 function again,
>>>>>> thus the
>>>>>> browser.arch will not get initialized. Then at the second time to
>>>>>> show the
>>>>>> annotate browser, the data type needs to be displayed but the
>>>>>> browser.arch
>>>>>> is empty.
>>>>>>
>>>>>> Stack trace as below:
>>>>>>
>>>>>> Perf: Segmentation fault
>>>>>> -------- backtrace --------
>>>>>> #0 0x55d365 in ui__signal_backtrace setup.c:0
>>>>>> #1 0x7f5ff1a3e930 in __restore_rt libc.so.6[3e930]
>>>>>> #2 0x570f08 in arch__is perf[570f08]
>>>>>> #3 0x562186 in annotate_get_insn_location perf[562186]
>>>>>> #4 0x562626 in __hist_entry__get_data_type annotate.c:0
>>>>>> #5 0x56476d in annotation_line__write perf[56476d]
>>>>>> #6 0x54e2db in annotate_browser__write annotate.c:0
>>>>>> #7 0x54d061 in ui_browser__list_head_refresh perf[54d061]
>>>>>> #8 0x54dc9e in annotate_browser__refresh annotate.c:0
>>>>>> #9 0x54c03d in __ui_browser__refresh browser.c:0
>>>>>> #10 0x54ccf8 in ui_browser__run perf[54ccf8]
>>>>>> #11 0x54eb92 in __hist_entry__tui_annotate perf[54eb92]
>>>>>> #12 0x552293 in do_annotate hists.c:0
>>>>>> #13 0x55941c in evsel__hists_browse hists.c:0
>>>>>> #14 0x55b00f in evlist__tui_browse_hists perf[55b00f]
>>>>>> #15 0x42ff02 in cmd_report perf[42ff02]
>>>>>> #16 0x494008 in run_builtin perf.c:0
>>>>>> #17 0x494305 in handle_internal_command perf.c:0
>>>>>> #18 0x410547 in main perf[410547]
>>>>>> #19 0x7f5ff1a295d0 in __libc_start_call_main libc.so.6[295d0]
>>>>>> #20 0x7f5ff1a29680 in __libc_start_main@@GLIBC_2.34
>>>>>> libc.so.6[29680]
>>>>>> #21 0x410b75 in _start perf[410b75]
>>>>>>
>>>>>> Fixes: 1d4374afd000 ("perf annotate: Add 'T' hot key to toggle
>>>>>> data type display")
>>>>>> Reviewed-by: James Clark <james.clark@...aro.org>
>>>>>> Signed-off-by: Tianyou Li <tianyou.li@...el.com>
>>>>>> ---
>>>>>> tools/perf/ui/browsers/annotate.c | 3 +++
>>>>>> tools/perf/util/annotate.c | 2 +-
>>>>>> tools/perf/util/annotate.h | 2 ++
>>>>>> 3 files changed, 6 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/
>>>>>> browsers/annotate.c
>>>>>> index 8fe699f98542..3b27ef1e8490 100644
>>>>>> --- a/tools/perf/ui/browsers/annotate.c
>>>>>> +++ b/tools/perf/ui/browsers/annotate.c
>>>>>> @@ -1161,6 +1161,9 @@ int __hist_entry__tui_annotate(struct
>>>>>> hist_entry *he, struct map_symbol *ms,
>>>>>> if (!annotation__has_source(notes))
>>>>>> ui__warning("Annotation has no source code.");
>>>>>> }
>>>>>> + } else if (evsel__get_arch(evsel, &browser.arch)) {
>>>>>> + ui__error("Couldn't get architecture for event '%s'",
>>>>>> evsel- >name);
>>>>>> + return -1;
>>>>>> }
>>>>>
>>>>> symbol_annotate() only fails for negative return values of
>>>>> evsel__get_arch(), but evsel__get_arch() has at least two positive
>>>>> error return values.
>>>>>
>>>>> If symbol_annotate() is wrong and it should be != 0 like you have,
>>>>> then maybe symbol_annotate() should be fixed in another commit in
>>>>> the same patchset as this one. Otherwise you have two calls to the
>>>>> same thing right next to each other that handle errors differently.
>>>>
>>>>
>>>> Thanks James. I will give a try on handling the error message with
>>>> symbol__strerror_disassemble. I am conservative to change the code
>>>> in symbol_annotate, agreed it should be considered in another
>>>> patch. Would like to focus this particular issue and get it fixed
>>>> properly. Thanks.
>>>>
>>>>
>>>
>>> Looks like there was a misunderstanding. I'm not saying that the
>>> error is _reported_ differently, it's that the condition that
>>> triggers the error is different.
>>>
>>> symbol__annotate():
>>>
>>> err = evsel__get_arch(evsel, &arch);
>>> if (err < 0)
>>> return err;
>>>
>>> You added:
>>>
>>> if (evsel__get_arch(evsel, &browser.arch))
>>> ...
>>>
>>> evsel__get_arch() returns positive error values (and maybe also
>>> negative?), so "< 0" behaves differently to "!= 0".
>>>
>>> You either have to assume that "< 0" is correct and not change it,
>>> but then you have to also check the return value in the same way. Or
>>> if by doing "!= 0" you're implying that symbol__annotate() is wrong
>>> to do "< 0", then you should fix it now to not leave
>>> __hist_entry__tui_annotate() doing the same thing two different ways
>>> at different times.
>>>
>> Thanks James. I looked at the code of symbol__annotate, and noticed
>> the if (err<0) statement. I did not mean to change the code in
>> symbol__annotate because I did not understand why it handled the
>> error code that way. The positive return value of evsel__get_arch
>> indicates some error happens, eg in arm__annotate_init, so I use the
>> symbol__strerror_disassemble function to handle both positive and
>> negative error code.
>>
>> I do agree we should check the error code of evsel__get_arch, but I
>> am hesitate to touch the code which I am not sure the consequences. I
>> agree it may deserve another patch but not in this patchset if we
>> have clear answers on why "<0" is not correct, or we have a case to
>> break the current code as a evidence. Thanks.
>>
>>
>> Regards,
>>
>> Tianyou
>>
>
> It may take a little bit of effort to follow the code and look at the
> git blame to see what happened, but it's really not going to be that
> hard.
Truly appreciated for your instant response, and the suggestions about
'Fixes' tag, return value handling etc. I do check the git history about
the code "<0", I still did not quite understand the reason of handling
it in that way.
>
> You're basically suggesting to add code that (when expanded) does this:
>
> if (first_run) {
> if (do_important_thing() < 0)
> return err;
> } else { // second run
> if (do_important_thing() != 0)
> return err;
> }
>
> It's not going to help anyone who looks at it in the future. It's
> going to make future refactors of evsel__get_arch() more difficult,
> and without knowing why it's like that, it's possibly introducing
> another bug.
>
I am suggesting to focus on the 'else' part. If that part of code is
correct, then we might need to consider another patch for the "<0" code.
I am eager for the answer as well.
> It surely has to be consistent otherwise it doesn't make sense. And if
> you sent a patch that did "< 0" I would still say "but it can return
> positive errors, so the new code isn't right".
> I did suggest in the beginning to not check the error at all and add a
> comment saying it must succeed at that point because it's already done
> once before, but that's not very defensive and it doesn't fix the
> other possible bug.
>
Yes. I am not so sure 'must succeed' could be a right assumption, or for
safety it's better to check the error code.
Regards,
Tianyou
>>>>>
>>>>>> /* Copy necessary information when it's called from perf
>>>>>> top */
>>>>>> diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
>>>>>> index a2e34f149a07..39d6594850f1 100644
>>>>>> --- a/tools/perf/util/annotate.c
>>>>>> +++ b/tools/perf/util/annotate.c
>>>>>> @@ -980,7 +980,7 @@ void symbol__calc_percent(struct symbol *sym,
>>>>>> struct evsel *evsel)
>>>>>> annotation__calc_percent(notes, evsel, symbol__size(sym));
>>>>>> }
>>>>>> -static int evsel__get_arch(struct evsel *evsel, struct arch
>>>>>> **parch)
>>>>>> +int evsel__get_arch(struct evsel *evsel, struct arch **parch)
>>>>>> {
>>>>>> struct perf_env *env = evsel__env(evsel);
>>>>>> const char *arch_name = perf_env__arch(env);
>>>>>> diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
>>>>>> index eaf6c8aa7f47..d4990bff29a7 100644
>>>>>> --- a/tools/perf/util/annotate.h
>>>>>> +++ b/tools/perf/util/annotate.h
>>>>>> @@ -585,4 +585,6 @@ void debuginfo_cache__delete(void);
>>>>>> int annotation_br_cntr_entry(char **str, int br_cntr_nr, u64
>>>>>> *br_cntr,
>>>>>> int num_aggr, struct evsel *evsel);
>>>>>> int annotation_br_cntr_abbr_list(char **str, struct evsel
>>>>>> *evsel, bool header);
>>>>>> +
>>>>>> +int evsel__get_arch(struct evsel *evsel, struct arch **parch);
>>>>>> #endif /* __PERF_ANNOTATE_H */
>>>>>
>>>>>
>>>
>>>
>
>
Powered by blists - more mailing lists