Message-ID: <Yrr4EHWlNJx3XW/K@xpf.sh.intel.com>
Date: Tue, 28 Jun 2022 20:46:08 +0800
From: Pengfei Xu <pengfei.xu@...el.com>
To: Peter Zijlstra <peterz@...radead.org>
Cc: Zijlstra Peter <peter.zijlstra@...el.com>,
Su Heng <heng.su@...el.com>, linux-kernel@...r.kernel.org,
Josh Poimboeuf <jpoimboe@...hat.com>,
Steven Rostedt <rostedt@...dmis.org>
Subject: Re: There was missing ENDBR BUG in 5.19-rc3 mainline kernel on TGL-U
Hi Peter,
On 2022-06-28 at 12:57:41 +0200, Peter Zijlstra wrote:
> On Tue, Jun 28, 2022 at 04:28:58PM +0800, Pengfei Xu wrote:
> > Hi Peter,
> >
> > Greetings!
> >
> > We found one "missing ENDBR BUG" on 5.19-rc3 kernel.
> >
> > Platform: TGL-U
> > Kernel: 5.19-rc3 mainline
> >
> > 1. Boot up TGL-U
> > 2. Execute kernel self-test shell script "ftracetest" in
> > kernel_source/tools/testing/selftests/ftrace/
> > # ./ftracetest
> > === Ftrace unit tests ===
> > [1] Basic trace file check [PASS]
> > [2] Basic test for tracers [PASS]
> > [3] Basic trace clock test [PASS]
> > [4] Basic event tracing check [PASS]
> > [5] Change the ringbuffer size [PASS]
> > [6] Snapshot and tracing setting [PASS]
> > [7] trace_pipe and trace_marker [PASS]
> > [8] Test ftrace direct functions against tracers [UNRESOLVED]
> > [9] Test ftrace direct functions against kprobes [UNRESOLVED]
> > [10] Generic dynamic event - add/remove eprobe events [FAIL]
> > [11] Generic dynamic event - add/remove kprobe events
> >
> > It reproduced 100% of the time at step 11, and then the missing ENDBR BUG was generated:
> > "
> > [ 9332.752836] mmiotrace: enabled CPU7.
> > [ 9332.788612] mmiotrace: disabled.
> > [ 9337.103426] traps: Missing ENDBR: syscall_regfunc+0x0/0xb0
> > [ 9337.103442] ------------[ cut here ]------------
> > [ 9337.103444] kernel BUG at arch/x86/kernel/traps.c:253!
> > [ 9337.103452] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> > ...
> > [ 9337.103506] Call Trace:
> > ...
> > [ 9337.103512] asm_exc_control_protection+0x30/0x40
> > ...
> > [ 9337.103540] ? trace_module_has_bad_taint+0x20/0x20
> > [ 9337.103547] ? tracepoint_add_func+0x15f/0x360
> > [ 9337.103551] ? perf_syscall_enter+0x1f0/0x1f0
> > [ 9337.103556] tracepoint_probe_register_prio+0x5c/0x90
> > [ 9337.103560] ? perf_syscall_enter+0x1f0/0x1f0
> > "
> >
> > The dmesg log is attached.
> > Do I need to do something further for this problem?
>
> your .config would perhaps have been useful... and a Cc to lkml.
>
> defconfig + kvm_guest.config + x86_debug.config + X86_KERNEL_IBT + lot
> of tracing options gets me:
>
> $ ./scripts/objdump-func defconfig-build/vmlinux.o syscall_regfunc
> 0000 0000000000181120 <syscall_regfunc>:
> 0000 181120: f3 0f 1e fa endbr64
> ...
>
> So the function does have an ENDBR for me. The other possibility
> is that the ENDBR got scribbled by the sealing.
>
> $ readelf -Wa defconfig-build/vmlinux.o | awk '/Relocation section.*ibt_endbr_seal/ { P=1 } /^$/ { if (P) exit } { if (P) print $0 }' | grep 181120
> 00000000000022a8 0000000200000002 R_X86_64_PC32 0000000000000000 .text + 181120
>
> And yes, that's it. So objtool somehow misses that the address of this
> function is taken.
>
> If we grep around:
>
> $ git grep syscall_regfunc
> include/linux/tracepoint.h:extern int syscall_regfunc(void);
> include/trace/events/syscalls.h: syscall_regfunc, syscall_unregfunc
> include/trace/events/syscalls.h: syscall_regfunc, syscall_unregfunc
> kernel/tracepoint.c:int syscall_regfunc(void)
>
> we find it is only used in tracepoints, which then suggests the
> following patch:
>
> diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> index 864bb9dd3584..57153e00349c 100644
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -3826,8 +3826,7 @@ static int validate_ibt(struct objtool_file *file)
> !strcmp(sec->name, "__bug_table") ||
> !strcmp(sec->name, "__ex_table") ||
> !strcmp(sec->name, "__jump_table") ||
> - !strcmp(sec->name, "__mcount_loc") ||
> - !strcmp(sec->name, "__tracepoints"))
> + !strcmp(sec->name, "__mcount_loc"))
> continue;
>
> list_for_each_entry(reloc, &sec->reloc->reloc_list, list)
>
> And that does indeed seem to do the trick!
Thanks a lot for sharing the debug method. It is a neat approach, and I can
learn from it.
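For anyone following along, here is a minimal Python sketch of what the readelf/awk/grep pipeline quoted above does: it keeps only the lines of the relocation section whose header mentions ibt_endbr_seal, stops at the first blank line after it, and reports the lines that mention a given address. The function name seal_relocs and its parameters are my own, not part of any kernel tooling, and the input is assumed to be the saved text output of `readelf -Wa` on vmlinux.o.

```python
def seal_relocs(readelf_text, needle):
    """Return lines of the ibt_endbr_seal relocation section containing needle.

    readelf_text: full text output of `readelf -Wa <object>` (assumed format).
    needle: substring to look for, e.g. the hex offset "181120".
    """
    hits = []
    in_section = False
    for line in readelf_text.splitlines():
        if "Relocation section" in line and "ibt_endbr_seal" in line:
            # Same role as the awk rule that sets P=1 on the section header.
            in_section = True
        elif in_section and line == "":
            # A blank line ends the section, like the awk `exit`.
            break
        if in_section and needle in line:
            # Same role as the trailing `grep`.
            hits.append(line)
    return hits
```

A non-empty result for "181120" would correspond to the R_X86_64_PC32 entry shown in the quoted output, meaning the ENDBR at that offset is on the list to be sealed.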
It seems you have found the root cause of the problem.
I have attached the kconfig as you suggested.
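As I understand the root cause, a toy model may help: if references coming from a skipped section are never counted, a function whose address is taken only from that section looks unreferenced and its ENDBR gets sealed. The sketch below is only an illustration of that idea, not objtool's actual code; sealed_functions, the section name ".data", and "other_func" are hypothetical, and the skip list contains only the names visible in the quoted diff context.

```python
def sealed_functions(endbr_funcs, relocs, skipped_sections):
    """Toy model: return functions whose ENDBR would be sealed.

    endbr_funcs: names of functions that start with ENDBR.
    relocs: iterable of (section_name, target_function) references.
    skipped_sections: section names whose references are ignored.
    """
    # Only references from non-skipped sections count as "address taken".
    referenced = {target for sec, target in relocs if sec not in skipped_sections}
    # Everything left over looks unreferenced, so its ENDBR is sealed.
    return set(endbr_funcs) - referenced


# Partial skip list, taken from the diff context only (the real list is longer).
skip_before = {"__bug_table", "__ex_table", "__jump_table",
               "__mcount_loc", "__tracepoints"}
skip_after = skip_before - {"__tracepoints"}

funcs = {"syscall_regfunc", "other_func"}          # other_func is hypothetical
relocs = [("__tracepoints", "syscall_regfunc"),    # reference via tracepoints
          (".data", "other_func")]                 # hypothetical reference

print(sorted(sealed_functions(funcs, relocs, skip_before)))
print(sorted(sealed_functions(funcs, relocs, skip_after)))
```

With "__tracepoints" in the skip list, the model seals syscall_regfunc; once it is dropped from the list, as in the patch, nothing in this example is sealed.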
--Pengfei
BR.
Thanks!
View attachment "config-5.19.0-rc4-ibtk" of type "text/plain" (279431 bytes)