Message-ID: <20240326155247.GJZgLvT_AZi3XPPpBM@fat_crate.local>
Date: Tue, 26 Mar 2024 16:52:47 +0100
From: Borislav Petkov <bp@...en8.de>
To: Nikolay Borisov <nik.borisov@...e.com>
Cc: Paul Menzel <pmenzel@...gen.mpg.de>,
	Thomas Gleixner <tglx@...utronix.de>,
	Peter Zijlstra <peterz@...radead.org>,
	Josh Poimboeuf <jpoimboe@...nel.org>,
	Ingo Molnar <mingo@...hat.com>,
	Dave Hansen <dave.hansen@...ux.intel.com>, x86@...nel.org,
	LKML <linux-kernel@...r.kernel.org>, Marco Elver <elver@...gle.com>,
	kasan-dev@...glegroups.com
Subject: Re: Unpatched return thunk in use. This should not happen!

On Tue, Mar 26, 2024 at 04:08:32PM +0200, Nikolay Borisov wrote:
> So the problem happens when KCSAN=y CONFIG_CONSTRUCTORS is also enabled and
> this results in an indirect call in do_mod_ctors():
> 
>    mod->ctors[i]();
> 
> 
> When KCSAN is disabled, do_mod_ctors is empty, hence the warning is not
> printed.

Yeah, KCSAN is doing something weird. I was able to stop the guest when
the warning fires. Here's what I see:

The callstack when it fires:

#0  warn_thunk_thunk () at arch/x86/entry/entry.S:48
#1  0xffffffff811a98f9 in do_mod_ctors (mod=0xffffffffa00052c0) at kernel/module/main.c:2462
#2  do_init_module (mod=mod@...ry=0xffffffffa00052c0) at kernel/module/main.c:2535
#3  0xffffffff811ad2e1 in load_module (info=info@...ry=0xffffc900004c7dd0, uargs=uargs@...ry=0x564c103dd4a0 "", flags=flags@...ry=0) at kernel/module/main.c:3001
#4  0xffffffff811ad8ef in init_module_from_file (f=f@...ry=0xffff8880151c5d00, uargs=uargs@...ry=0x564c103dd4a0 "", flags=flags@...ry=0) at kernel/module/main.c:3168
#5  0xffffffff811adade in idempotent_init_module (f=f@...ry=0xffff8880151c5d00, uargs=uargs@...ry=0x564c103dd4a0 "", flags=flags@...ry=0) at kernel/module/main.c:3185
#6  0xffffffff811adec9 in __do_sys_finit_module (flags=0, uargs=0x564c103dd4a0 "", fd=3) at kernel/module/main.c:3206
#7  __se_sys_finit_module (flags=<optimized out>, uargs=94884689990816, fd=3) at kernel/module/main.c:3189
#8  __x64_sys_finit_module (regs=<optimized out>) at kernel/module/main.c:3189
#9  0xffffffff81fccdff in do_syscall_x64 (nr=<optimized out>, regs=0xffffc900004c7f58) at arch/x86/entry/common.c:52
#10 do_syscall_64 (regs=0xffffc900004c7f58, nr=<optimized out>) at arch/x86/entry/common.c:83
#11 0xffffffff82000126 in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:120
#12 0x0000000000000000 in ?? ()

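Now, when we look at frame #1 (do_mod_ctors is inlined into do_init_module,
which is why the return address 0xffffffff811a98f9 lands inside it):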

ffffffff811a9800 <do_init_module>:
ffffffff811a9800:       e8 bb 36 ee ff          call   ffffffff8108cec0 <__fentry__>
ffffffff811a9805:       41 57                   push   %r15
ffffffff811a9807:       41 56                   push   %r14
ffffffff811a9809:       41 55                   push   %r13
ffffffff811a980b:       41 54                   push   %r12
ffffffff811a980d:       55                      push   %rbp
ffffffff811a980e:       53                      push   %rbx
ffffffff811a980f:       48 89 fb                mov    %rdi,%rbx
ffffffff811a9812:       48 c7 c7 c8 9f 6a 82    mov    $0xffffffff826a9fc8,%rdi
ffffffff811a9819:       48 83 ec 08             sub    $0x8,%rsp
ffffffff811a981d:       e8 5e 51 0d 00          call   ffffffff8127e980 <__tsan_read8>
ffffffff811a9822:       48 8b 3d 9f 07 50 01    mov    0x150079f(%rip),%rdi        # ffffffff826a9fc8 <kmalloc_caches+0x28>

..

ffffffff811a98ec:       e8 8f 50 0d 00          call   ffffffff8127e980 <__tsan_read8>
ffffffff811a98f1:       49 8b 07                mov    (%r15),%rax
ffffffff811a98f4:       e8 27 d1 e3 00          call   ffffffff81fe6a20 <__x86_indirect_thunk_array>
ffffffff811a98f9:       4c 89 ef                mov    %r13,%rdi

there's that call to __x86_indirect_thunk_array, which looks like this in the
static kernel image:

ffffffff81fe6a20 <__x86_indirect_thunk_array>:
ffffffff81fe6a20:       e8 01 00 00 00          call   ffffffff81fe6a26 <__x86_indirect_thunk_array+0x6>
ffffffff81fe6a25:       cc                      int3
ffffffff81fe6a26:       48 89 04 24             mov    %rax,(%rsp)
ffffffff81fe6a2a:       e9 b1 07 00 00          jmp    ffffffff81fe71e0 <__x86_return_thunk>

where you'd think, ah, yes, that's why it fires.
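
As a sanity check on the listings above, here is a minimal sketch that decodes
the rel32 displacement of the 5-byte call/jmp encodings and recomputes their
targets. The helper name rel32_target is made up for illustration; the
addresses and bytes are copied from the objdump output above.

```python
def rel32_target(insn_addr: int, insn_bytes: bytes) -> int:
    """Return the target of a 5-byte x86-64 call (e8) or jmp (e9) rel32.

    Hypothetical helper, not part of any kernel tooling.
    """
    if len(insn_bytes) != 5 or insn_bytes[0] not in (0xE8, 0xE9):
        raise ValueError("expected a 5-byte e8/e9 rel32 instruction")
    # The displacement is a signed little-endian 32-bit value, relative to
    # the address of the *next* instruction (insn_addr + 5).
    disp = int.from_bytes(insn_bytes[1:], "little", signed=True)
    return (insn_addr + 5 + disp) & 0xFFFFFFFFFFFFFFFF


# call at ffffffff811a98f4 in do_init_module: e8 27 d1 e3 00
call_target = rel32_target(0xFFFFFFFF811A98F4, bytes.fromhex("e827d1e300"))
# jmp at ffffffff81fe6a2a in __x86_indirect_thunk_array: e9 b1 07 00 00
jmp_target = rel32_target(0xFFFFFFFF81FE6A2A, bytes.fromhex("e9b1070000"))

print(hex(call_target))  # address of __x86_indirect_thunk_array
print(hex(jmp_target))   # address of __x86_return_thunk in the static image
```

The call's return address is 0xffffffff811a98f4 + 5 = 0xffffffff811a98f9,
which is the address gdb reports for frame #1.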

BUT! The live kernel image in gdb looks like this:

Dump of assembler code for function __x86_indirect_thunk_array:
   0xffffffff81fe6a20 <+0>:     call   0xffffffff81fe6a26 <__x86_indirect_thunk_array+6>
   0xffffffff81fe6a25 <+5>:     int3 
   0xffffffff81fe6a26 <+6>:     mov    %rax,(%rsp)
   0xffffffff81fe6a2a <+10>:    jmp    0xffffffff81fe70a0 <srso_return_thunk>

so the right thunk is already there!
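
To make the difference between the two images concrete, here is a rough
sketch of which bytes the jmp at +10 would carry for each target. It assumes
the live instruction is still a 5-byte e9 rel32 jmp (gdb's output above does
not show the raw bytes), and encode_jmp_rel32 is a hypothetical helper, not a
kernel function.

```python
def encode_jmp_rel32(insn_addr: int, target: int) -> bytes:
    """Encode a 5-byte 'jmp rel32' (opcode e9) placed at insn_addr.

    Hypothetical helper for illustration only.
    """
    # Displacement is relative to the end of the 5-byte instruction.
    disp = target - (insn_addr + 5)
    if not -(1 << 31) <= disp < (1 << 31):
        raise ValueError("target out of rel32 range")
    return b"\xe9" + disp.to_bytes(4, "little", signed=True)


JMP_ADDR = 0xFFFFFFFF81FE6A2A            # __x86_indirect_thunk_array+10
X86_RETURN_THUNK = 0xFFFFFFFF81FE71E0    # target in the static image
SRSO_RETURN_THUNK = 0xFFFFFFFF81FE70A0   # target seen in the live image

static_bytes = encode_jmp_rel32(JMP_ADDR, X86_RETURN_THUNK)
live_bytes = encode_jmp_rel32(JMP_ADDR, SRSO_RETURN_THUNK)

print(static_bytes.hex(" "))  # matches the objdump bytes: e9 b1 07 00 00
print(live_bytes.hex(" "))    # assumed encoding after patching
```

Under that assumption only the displacement changes between the two images;
the opcode and instruction length stay the same.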

And yet, the warning still fired.

I need to single-step through this whole module loading path more carefully.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette
