Message-ID: <Yrr4EHWlNJx3XW/K@xpf.sh.intel.com>
Date:   Tue, 28 Jun 2022 20:46:08 +0800
From:   Pengfei Xu <pengfei.xu@...el.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Zijlstra Peter <peter.zijlstra@...el.com>,
        Su Heng <heng.su@...el.com>, linux-kernel@...r.kernel.org,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Steven Rostedt <rostedt@...dmis.org>
Subject: Re: There was missing ENDBR BUG in 5.19-rc3 mainline kernel on TGL-U
Hi Peter,
On 2022-06-28 at 12:57:41 +0200, Peter Zijlstra wrote:
> On Tue, Jun 28, 2022 at 04:28:58PM +0800, Pengfei Xu wrote:
> > Hi Peter,
> > 
> >   Greeting!
> > 
> >   We found one "missing ENDBR BUG" on 5.19-rc3 kernel.
> > 
> >   Platform: TGL-U
> >   Kernel: 5.19-rc3 mainline
> > 
> >   1. Boot up TGL-U
> >   2. Execute kernel self-test shell script "ftracetest" in
> >      kernel_source/tools/testing/selftests/ftrace/
> > # ./ftracetest
> > === Ftrace unit tests ===
> > [1] Basic trace file check      [PASS]
> > [2] Basic test for tracers      [PASS]
> > [3] Basic trace clock test      [PASS]
> > [4] Basic event tracing check   [PASS]
> > [5] Change the ringbuffer size  [PASS]
> > [6] Snapshot and tracing setting        [PASS]
> > [7] trace_pipe and trace_marker [PASS]
> > [8] Test ftrace direct functions against tracers        [UNRESOLVED]
> > [9] Test ftrace direct functions against kprobes        [UNRESOLVED]
> > [10] Generic dynamic event - add/remove eprobe events   [FAIL]
> > [11] Generic dynamic event - add/remove kprobe events
> > 
> > It reproduces 100% of the time at test 11, which then triggers the missing ENDBR BUG:
> > "
> > [ 9332.752836] mmiotrace: enabled CPU7.
> > [ 9332.788612] mmiotrace: disabled.
> > [ 9337.103426] traps: Missing ENDBR: syscall_regfunc+0x0/0xb0
> > [ 9337.103442] ------------[ cut here ]------------
> > [ 9337.103444] kernel BUG at arch/x86/kernel/traps.c:253!
> > [ 9337.103452] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> > ...
> > [ 9337.103506] Call Trace:
> > ...
> > [ 9337.103512]  asm_exc_control_protection+0x30/0x40
> > ...
> > [ 9337.103540]  ? trace_module_has_bad_taint+0x20/0x20
> > [ 9337.103547]  ? tracepoint_add_func+0x15f/0x360
> > [ 9337.103551]  ? perf_syscall_enter+0x1f0/0x1f0
> > [ 9337.103556]  tracepoint_probe_register_prio+0x5c/0x90
> > [ 9337.103560]  ? perf_syscall_enter+0x1f0/0x1f0
> > "
> > 
> > The dmesg is attached.
> > Do I need to do something further for this problem?
> 
> your .config would perhaps have been useful... and a Cc to lkml.
> 
> defconfig + kvm_guest.config + x86_debug.config + X86_KERNEL_IBT + lot
> of tracing options gets me:
> 
> $ ./scripts/objdump-func defconfig-build/vmlinux.o syscall_regfunc
> 0000 0000000000181120 <syscall_regfunc>:
> 0000   181120:  f3 0f 1e fa             endbr64
> ...
> 
> So the function does have an ENDBR for me. Now the other possibility
> is that the ENDBR got scribbled by the sealing.
> 
> $ readelf -Wa defconfig-build/vmlinux.o | awk '/Relocation section.*ibt_endbr_seal/ { P=1 } /^$/ { if (P) exit } { if (P) print $0 }' | grep 181120
> 00000000000022a8  0000000200000002 R_X86_64_PC32          0000000000000000 .text + 181120
> 
> And yes, that's it. So objtool somehow misses that the address of this
> function is taken.
> 
> If we grep around:
> 
> $ git grep syscall_regfunc
> include/linux/tracepoint.h:extern int syscall_regfunc(void);
> include/trace/events/syscalls.h:        syscall_regfunc, syscall_unregfunc
> include/trace/events/syscalls.h:        syscall_regfunc, syscall_unregfunc
> kernel/tracepoint.c:int syscall_regfunc(void)
> 
> we find it is only used in tracepoints, which then suggests the
> following patch:
> 
> diff --git a/tools/objtool/check.c b/tools/objtool/check.c
> index 864bb9dd3584..57153e00349c 100644
> --- a/tools/objtool/check.c
> +++ b/tools/objtool/check.c
> @@ -3826,8 +3826,7 @@ static int validate_ibt(struct objtool_file *file)
>  		    !strcmp(sec->name, "__bug_table")			||
>  		    !strcmp(sec->name, "__ex_table")			||
>  		    !strcmp(sec->name, "__jump_table")			||
> -		    !strcmp(sec->name, "__mcount_loc")			||
> -		    !strcmp(sec->name, "__tracepoints"))
> +		    !strcmp(sec->name, "__mcount_loc"))
>  			continue;
>  
>  		list_for_each_entry(reloc, &sec->reloc->reloc_list, list)
> 
> And that does indeed seem to do the trick!
Thanks a lot for sharing the debug method; it is neat and I learned from it.
It seems you have found the root cause of the problem.
I have attached the kconfig as you suggested.
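
For anyone repeating the two checks quoted above (that the function entry
carries an ENDBR, and that the function is listed in the ibt_endbr_seal
relocations), here is a minimal sketch of them as shell helpers. The names
starts_with_endbr and seal_has_offset are hypothetical and not part of the
kernel tree. The sketch assumes the input looks like the objdump-func and
readelf output shown in the quote, and it matches the addend exactly rather
than using a plain grep.

```shell
#!/bin/sh
# Hypothetical helpers; both read tool output from stdin so they can be
# piped after the commands used in the quoted analysis.

# Succeeds if the first instruction after "<sym>:" is endbr64.
# $1: symbol name, e.g. syscall_regfunc
starts_with_endbr() {
    awk -v sym="$1" '
        index($0, "<" sym ">:") {
            # The line right after the symbol header is the first instruction.
            if ((getline insn) > 0 && insn ~ /endbr64/) found = 1
            exit
        }
        END { exit (found ? 0 : 1) }
    '
}

# Succeeds if ".text + <offset>" appears in the ibt_endbr_seal relocation
# section of `readelf -Wa` output.
# $1: hex offset without a 0x prefix, e.g. 181120
seal_has_offset() {
    awk -v off="$1" '
        /Relocation section.*ibt_endbr_seal/ { in_sec = 1; next }
        in_sec && /^$/ { exit }          # a blank line ends the section
        # Append "" to force a string comparison of the addend field.
        in_sec && $(NF-1) == "+" && ($NF "") == (off "") { found = 1; exit }
        END { exit (found ? 0 : 1) }
    '
}
```

Possible usage, with the same paths as in the quote:

  ./scripts/objdump-func defconfig-build/vmlinux.o syscall_regfunc | starts_with_endbr syscall_regfunc && echo "has ENDBR"
  readelf -Wa defconfig-build/vmlinux.o | seal_has_offset 181120 && echo "listed for sealing"

If both succeed, the ENDBR is present in the object file but the function is
still listed for sealing, which is the situation described in the quoted
analysis.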
Thanks!

BR,
Pengfei
View attachment "config-5.19.0-rc4-ibtk" of type "text/plain" (279431 bytes)