Message-ID: <Z3b-DqBMnNb4ucEm@google.com>
Date: Thu, 2 Jan 2025 12:58:54 -0800
From: Namhyung Kim <namhyung@...nel.org>
To: Arnaldo Carvalho de Melo <acme@...nel.org>
Cc: Christophe Leroy <christophe.leroy@...roup.eu>,
	Adrian Hunter <adrian.hunter@...el.com>,
	Ian Rogers <irogers@...gle.com>,
	James Clark <james.clark@...aro.org>, Jiri Olsa <jolsa@...nel.org>,
	Kan Liang <kan.liang@...ux.intel.com>,
	Linux Kernel Mailing List <linux-kernel@...r.kernel.org>,
	linux-perf-users@...r.kernel.org
Subject: Re: [BUG] perf top reports not being able to resolve kernel symbols

Hi Arnaldo,

On Thu, Jan 02, 2025 at 04:51:06PM -0300, Arnaldo Carvalho de Melo wrote:
> On Thu, Jan 02, 2025 at 04:25:07PM -0300, Arnaldo Carvalho de Melo wrote:
> > root@...ber:~# readelf -sw /lib/modules/6.13.0-rc2/build/vmlinux | grep -B5 -A5 ' 0000000001600'
> > 259227: ffffffff8156e290   262 FUNC    GLOBAL DEFAULT    1 zs_free
> > 259228: ffffffff8183a4d0   269 FUNC    GLOBAL DEFAULT    1 security_inode_g[...]
> > 259229: ffffffff81c8d900   191 FUNC    GLOBAL DEFAULT    1 devres_find
> > 259230: ffffffff812e11c0    16 FUNC    GLOBAL DEFAULT    1 __pfx___probestu[...]
> > 259231: ffffffff81c985a0    16 FUNC    GLOBAL DEFAULT    1 __pfx_pm_qos_sys[...]
> > 259232: 0000000001600000     0 NOTYPE  GLOBAL DEFAULT  ABS text_size
> > 259233: ffffffff81487f10   117 FUNC    GLOBAL DEFAULT    1 shmem_read_folio_gfp
> > 259234: ffffffff81e08540   155 FUNC    GLOBAL DEFAULT    1 __traceiter_smbu[...]
> > 259235: ffffffff811e13a0    16 FUNC    GLOBAL DEFAULT    1 __pfx_thaw_workqueues
> > 259236: ffffffff81b04c70   599 FUNC    GLOBAL DEFAULT    1 acpi_install_method
> > 259237: ffffffff81de7d40    16 FUNC    GLOBAL DEFAULT    1 __pfx_psmouse_se[...]
> > root@...ber:~#
>  
> > There it is, that "text_size" symbol stayed with a prev->end equal
> > to prev->start and thus 0x00000000016001c1 stops being resolved, which
> > leads us to get to that buggy warning.
>  
> > I'll put all this into a patch and send it for review,
> 
> But looking further, where are those 0x00000000016001c1 addresses coming
> from?
> 
> (gdb) p /x sample->ip
> $10 = 0xffffffffb7401fad
> (gdb) p /x al->addr
> $11 = 0x1601fad
> (gdb) bt
> #0  perf_event__process_sample (tool=0x7fffffff9bd0, event=0x1017400, evsel=0xf68860, sample=0x7fff8dffa470, machine=0xf8e818) at builtin-top.c:813
> #1  0x0000000000447c5c in deliver_event (qe=0x7fffffff9ee8, qevent=0x1024670) at builtin-top.c:1213
> #2  0x0000000000642706 in do_flush (oe=0x7fffffff9ee8, show_progress=false) at util/ordered-events.c:245
> #3  0x0000000000642a5d in __ordered_events__flush (oe=0x7fffffff9ee8, how=OE_FLUSH__TOP, timestamp=0) at util/ordered-events.c:324
> #4  0x0000000000642b47 in ordered_events__flush (oe=0x7fffffff9ee8, how=OE_FLUSH__TOP) at util/ordered-events.c:342
> #5  0x00000000004477e9 in process_thread (arg=0x7fffffff9bd0) at builtin-top.c:1125
> #6  0x00007ffff6ea5d97 in start_thread () from /lib64/libc.so.6
> #7  0x00007ffff6f29c8c in clone3 () from /lib64/libc.so.6
> (gdb)
> 
> root@...ber:~# grep ffffffffb7401f /proc/kallsyms 
> ffffffffb7401f09 t repeat_nmi
> ffffffffb7401f2e t end_repeat_nmi
> ffffffffb7401f81 t nmi_no_fsgsbase
> ffffffffb7401f85 t nmi_swapgs
> ffffffffb7401f88 t nmi_restore
> ffffffffb7401fb0 T entry_SYSCALL32_ignore
> ffffffffb7401fd0 T __pfx_clear_bhb_loop
> ffffffffb7401fe0 T clear_bhb_loop
> root@...ber:~# 
> 
> Looks like nmi_restore...
> 
> Which is...
> 
>    780: ffffffff82401ee8     0 NOTYPE  LOCAL  DEFAULT    1 nested_nmi_out
>    781: ffffffff82401ed0     0 NOTYPE  LOCAL  DEFAULT    1 nested_nmi
>    782: ffffffff82401eeb     0 NOTYPE  LOCAL  DEFAULT    1 first_nmi
>    783: ffffffff82401f81     0 NOTYPE  LOCAL  DEFAULT    1 nmi_no_fsgsbase
>    784: ffffffff82401f88     0 NOTYPE  LOCAL  DEFAULT    1 nmi_restore
>    785: ffffffff82401f85     0 NOTYPE  LOCAL  DEFAULT    1 nmi_swapgs
>    786: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS syscall_64.c
>    787: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS common.c
>    788: ffffffff810cc2b0    16 FUNC    LOCAL  DEFAULT    1 ia32_emulation_o[...]
>    789: ffffffff821e57f0   241 FUNC    LOCAL  DEFAULT    1 __do_fast_syscall_32
> 
> So there are symbols that are not being resolved anymore that were
> before your patch, namely:
> 
> arch/x86/entry/entry_64.S
> 
> nmi_no_fsgsbase:
>         /* EBX == 0 -> invoke SWAPGS */
>         testl   %ebx, %ebx
>         jnz     nmi_restore
> 
> nmi_swapgs:
>         swapgs
> 
> nmi_restore:
>         POP_REGS
> 

Sorry about that, maybe I should've done this instead.  Can you check
if it works correctly?

Thanks,
Namhyung

---8<---

From 3130ee711d28f6e280d4bf04bdacca094657bb99 Mon Sep 17 00:00:00 2001
From: Namhyung Kim <namhyung@...nel.org>
Date: Thu, 2 Jan 2025 12:32:51 -0800
Subject: [PATCH] perf symbol: Prefer non-label symbols with same address

When there is more than one symbol at the same address, perf needs to
choose the best one.  choose_best_symbol() did not check the symbol
type.  Labels can exist inside other symbols, and in that case it is
better to pick the actual symbol over the label.  To minimize the
possible impact on other symbols, I only check NOTYPE symbols
specifically.

  $ readelf -sW vmlinux | grep -e __do_softirq -e __softirqentry_text_start
  105089: ffffffff82000000   814 FUNC    GLOBAL DEFAULT    1 __do_softirq
  111954: ffffffff82000000     0 NOTYPE  GLOBAL DEFAULT    1 __softirqentry_text_start

Commit 77b004f4c5c3c90b tried to do the same by not giving a size to
label symbols, but it seems there are some label-only symbols in asm
code.  Let's restore the original code and choose the right symbol
using the type of the symbols.

Fixes: 77b004f4c5c3c90b ("perf symbol: Do not fixup end address of labels")
Signed-off-by: Namhyung Kim <namhyung@...nel.org>
---
 tools/perf/util/symbol.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 0037f11639195dbf..49b08adc6ee34365 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -154,6 +154,13 @@ static int choose_best_symbol(struct symbol *syma, struct symbol *symb)
 	else if ((a == 0) && (b > 0))
 		return SYMBOL_B;
 
+	if (syma->type != symb->type) {
+		if (syma->type == STT_NOTYPE)
+			return SYMBOL_B;
+		if (symb->type == STT_NOTYPE)
+			return SYMBOL_A;
+	}
+
 	/* Prefer a non weak symbol over a weak one */
 	a = syma->binding == STB_WEAK;
 	b = symb->binding == STB_WEAK;
@@ -257,7 +264,7 @@ void symbols__fixup_end(struct rb_root_cached *symbols, bool is_kallsyms)
 		 * like in:
 		 *   ffffffffc1937000 T hdmi_driver_init  [snd_hda_codec_hdmi]
 		 */
-		if (prev->end == prev->start && prev->type != STT_NOTYPE) {
+		if (prev->end == prev->start) {
 			const char *prev_mod;
 			const char *curr_mod;
 
-- 
2.47.1.613.gc27f4b7a9f-goog
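
For readers following the thread, here is a rough standalone sketch of the
two behaviours the patch relies on.  It is not the perf code: the
sketch_* names, the flat sorted array, and the linear lookup are all
assumptions made for illustration (perf keeps symbols in an rb-tree and
symbols__fixup_end() also handles module boundaries and the last symbol).
Only the STT_NOTYPE/STT_FUNC values are taken from the ELF definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ELF symbol type values, same numbers as STT_NOTYPE/STT_FUNC in <elf.h>. */
enum { SKETCH_STT_NOTYPE = 0, SKETCH_STT_FUNC = 2 };

/* Hypothetical, simplified stand-in for perf's struct symbol. */
struct sketch_symbol {
	uint64_t start;
	uint64_t end;		/* end == start means "no size known" */
	uint8_t type;
	const char *name;
};

/*
 * Extend every zero-length symbol up to the start of the next one.
 * 'syms' must be sorted by start address.  The last entry is left
 * alone here; the real code treats it separately.
 */
static void sketch_fixup_end(struct sketch_symbol *syms, size_t n)
{
	assert(syms != NULL || n == 0);

	for (size_t i = 1; i < n; i++) {
		struct sketch_symbol *prev = &syms[i - 1];

		if (prev->end == prev->start)
			prev->end = syms[i].start;
	}
}

/* Return the symbol whose [start, end) range covers addr, or NULL. */
static const struct sketch_symbol *
sketch_find(const struct sketch_symbol *syms, size_t n, uint64_t addr)
{
	for (size_t i = 0; i < n; i++) {
		if (addr >= syms[i].start && addr < syms[i].end)
			return &syms[i];
	}
	return NULL;
}

/*
 * Tie-break for two symbols at the same address: prefer the one that
 * has a real type over a NOTYPE label.  Returns NULL when the type
 * does not decide, so the caller can fall through to other criteria.
 */
static const struct sketch_symbol *
sketch_prefer_typed(const struct sketch_symbol *a,
		    const struct sketch_symbol *b)
{
	if (a->type != b->type) {
		if (a->type == SKETCH_STT_NOTYPE)
			return b;
		if (b->type == SKETCH_STT_NOTYPE)
			return a;
	}
	return NULL;
}
```

With three zero-length labels laid out like nmi_no_fsgsbase, nmi_swapgs and
nmi_restore, followed by a sized function, an address between the last
label and the function is covered by no range until sketch_fixup_end() has
run.  That is the case the NOTYPE exclusion in 77b004f4c5c3c90b broke.
The type-based tie-break then handles the __do_softirq versus
__softirqentry_text_start case without leaving labels unsized.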

