Message-ID: <80582244-8c1c-4eb4-8881-db68a1428817@suse.com>
Date: Tue, 26 Mar 2024 18:04:26 +0200
From: Nikolay Borisov <nik.borisov@...e.com>
To: Borislav Petkov <bp@...en8.de>
Cc: Paul Menzel <pmenzel@...gen.mpg.de>, Thomas Gleixner
<tglx@...utronix.de>, Peter Zijlstra <peterz@...radead.org>,
Josh Poimboeuf <jpoimboe@...nel.org>, Ingo Molnar <mingo@...hat.com>,
Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
LKML <linux-kernel@...r.kernel.org>, Marco Elver <elver@...gle.com>,
kasan-dev@...glegroups.com
Subject: Re: Unpatched return thunk in use. This should not happen!
On 26.03.24 г. 17:52 ч., Borislav Petkov wrote:
> On Tue, Mar 26, 2024 at 04:08:32PM +0200, Nikolay Borisov wrote:
>> So the problem happens when KCSAN=y CONFIG_CONSTRUCTORS is also enabled and
>> this results in an indirect call in do_mod_ctors():
>>
>> mod->ctors[i]();
>>
>>
>> When KCSAN is disabled, do_mod_ctors is empty, hence the warning is not
>> printed.
>
> Yeah, KCSAN is doing something weird. I was able to stop the guest when
> the warning fires. Here's what I see:
>
> The callstack when it fires:
>
> #0 warn_thunk_thunk () at arch/x86/entry/entry.S:48
> #1 0xffffffff811a98f9 in do_mod_ctors (mod=0xffffffffa00052c0) at kernel/module/main.c:2462
> #2 do_init_module (mod=mod@entry=0xffffffffa00052c0) at kernel/module/main.c:2535
> #3 0xffffffff811ad2e1 in load_module (info=info@entry=0xffffc900004c7dd0, uargs=uargs@entry=0x564c103dd4a0 "", flags=flags@entry=0) at kernel/module/main.c:3001
> #4 0xffffffff811ad8ef in init_module_from_file (f=f@entry=0xffff8880151c5d00, uargs=uargs@entry=0x564c103dd4a0 "", flags=flags@entry=0) at kernel/module/main.c:3168
> #5 0xffffffff811adade in idempotent_init_module (f=f@entry=0xffff8880151c5d00, uargs=uargs@entry=0x564c103dd4a0 "", flags=flags@entry=0) at kernel/module/main.c:3185
> #6 0xffffffff811adec9 in __do_sys_finit_module (flags=0, uargs=0x564c103dd4a0 "", fd=3) at kernel/module/main.c:3206
> #7 __se_sys_finit_module (flags=<optimized out>, uargs=94884689990816, fd=3) at kernel/module/main.c:3189
> #8 __x64_sys_finit_module (regs=<optimized out>) at kernel/module/main.c:3189
> #9 0xffffffff81fccdff in do_syscall_x64 (nr=<optimized out>, regs=0xffffc900004c7f58) at arch/x86/entry/common.c:52
> #10 do_syscall_64 (regs=0xffffc900004c7f58, nr=<optimized out>) at arch/x86/entry/common.c:83
> #11 0xffffffff82000126 in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120
> #12 0x0000000000000000 in ?? ()
>
> Now, when we look at frame #1:
>
> ffffffff811a9800 <do_init_module>:
> ffffffff811a9800: e8 bb 36 ee ff call ffffffff8108cec0 <__fentry__>
> ffffffff811a9805: 41 57 push %r15
> ffffffff811a9807: 41 56 push %r14
> ffffffff811a9809: 41 55 push %r13
> ffffffff811a980b: 41 54 push %r12
> ffffffff811a980d: 55 push %rbp
> ffffffff811a980e: 53 push %rbx
> ffffffff811a980f: 48 89 fb mov %rdi,%rbx
> ffffffff811a9812: 48 c7 c7 c8 9f 6a 82 mov $0xffffffff826a9fc8,%rdi
> ffffffff811a9819: 48 83 ec 08 sub $0x8,%rsp
> ffffffff811a981d: e8 5e 51 0d 00 call ffffffff8127e980 <__tsan_read8>
> ffffffff811a9822: 48 8b 3d 9f 07 50 01 mov 0x150079f(%rip),%rdi # ffffffff826a9fc8 <kmalloc_caches+0x28>
>
> ...
>
> ffffffff811a98ec: e8 8f 50 0d 00 call ffffffff8127e980 <__tsan_read8>
> ffffffff811a98f1: 49 8b 07 mov (%r15),%rax
> ffffffff811a98f4: e8 27 d1 e3 00 call ffffffff81fe6a20 <__x86_indirect_thunk_array>
> ffffffff811a98f9: 4c 89 ef mov %r13,%rdi
>
> there's that call to the indirect array. Which is in the static kernel image:
>
> ffffffff81fe6a20 <__x86_indirect_thunk_array>:
> ffffffff81fe6a20: e8 01 00 00 00 call ffffffff81fe6a26 <__x86_indirect_thunk_array+0x6>
> ffffffff81fe6a25: cc int3
> ffffffff81fe6a26: 48 89 04 24 mov %rax,(%rsp)
> ffffffff81fe6a2a: e9 b1 07 00 00 jmp ffffffff81fe71e0 <__x86_return_thunk>
>
> where you'd think, ah, yes, that's why it fires.
>
> BUT! The live kernel image in gdb looks like this:
>
> Dump of assembler code for function __x86_indirect_thunk_array:
> 0xffffffff81fe6a20 <+0>: call 0xffffffff81fe6a26 <__x86_indirect_thunk_array+6>
> 0xffffffff81fe6a25 <+5>: int3
> 0xffffffff81fe6a26 <+6>: mov %rax,(%rsp)
> 0xffffffff81fe6a2a <+10>: jmp 0xffffffff81fe70a0 <srso_return_thunk>
>
> so the right thunk is already there!
>
> And yet, the warning still fired.
But from within srso_return_thunk you eventually end up at the address
that was in %RAX, so that is likely where the warning is triggered.
As far as I can tell, that address is supposed to be a compiler-generated
constructor that calls __tsan_init. Dumping the .init_array section
shows:
    .type _sub_I_00099_0, @function
_sub_I_00099_0:
    endbr64
    call __tsan_init #
    jmp __x86_return_thunk
    .size _sub_I_00099_0, .-_sub_I_00099_0
    .section .init_array.00099,"aw"
    .align 8
    .quad _sub_I_00099_0
    .ident "GCC: (Ubuntu 12.3.0-1ubuntu1~22.04) 12.3.0"
    .section .note.GNU-stack,"",@progbits
    .section .note.gnu.property,"a"
    .align 8
    .long 1f - 0f
    .long 4f - 1f
    .long 5
0:
    .string "GNU"
1:
    .align 8
    .long 0xc0000002
    .long 3f - 2f
2:
    .long 0x1
3:
    .align 8
4:
So this _sub_I_00099_0 is the compiler-generated ctor that is
likely not patched. What's strange is that, with debugging code added, I
see two ctors being executed and only the second one triggers the warning:
[ 7.635418] in do_mod_ctors
[ 7.635425] calling 0 ctor 00000000aa7a443a
[ 7.635430] called 0 ctor
[ 7.635433] calling 1 ctor 00000000fe9d0d54
[ 7.635437] ------------[ cut here ]------------
[ 7.635441] Unpatched return thunk in use. This should not happen!
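For illustration, instrumentation of this kind can be modelled in
userspace. This is only a sketch under stated assumptions, not the actual
debug patch: struct fake_module, ctor_fn_t, ctor_a/ctor_b, ctors_run and
do_mod_ctors_traced() are all hypothetical names, printf stands in for
printk, and the ctor address is printed raw. The point is that a
"calling N" line without a matching "called N" line before the splat
means the warning fired while ctor N was executing.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's constructor pointer type. */
typedef void (*ctor_fn_t)(void);

/* Hypothetical, trimmed-down stand-in for struct module: only the
 * constructor table and its length. */
struct fake_module {
	ctor_fn_t *ctors;
	unsigned int num_ctors;
};

/* Counts constructor invocations so a caller can verify the loop. */
static unsigned int ctors_run;

/* Two dummy constructors standing in for compiler-generated ones. */
static void ctor_a(void) { ctors_run++; }
static void ctor_b(void) { ctors_run++; }

/* Userspace model of an instrumented ctor loop: log before and after
 * every indirect call, and return how many constructors were called. */
static unsigned int do_mod_ctors_traced(const struct fake_module *mod)
{
	unsigned int i;

	assert(mod != NULL);

	printf("in do_mod_ctors\n");
	for (i = 0; i < mod->num_ctors; i++) {
		printf("calling %u ctor %016" PRIxPTR "\n",
		       i, (uintptr_t)mod->ctors[i]);
		mod->ctors[i]();	/* the indirect call under suspicion */
		printf("called %u ctor\n", i);
	}
	return i;
}
```

Running do_mod_ctors_traced() on a table holding ctor_a and ctor_b prints
one calling/called pair per entry.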
>
> I need to singlestep this whole loading bit more carefully.
>
> Thx.
>