lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Mon, 19 Sep 2022 19:35:42 -0700
From:   Nadav Amit <nadav.amit@...il.com>
To:     "Steven Rostedt (Google)" <rostedt@...dmis.org>,
        "Peter Zijlstra (Intel)" <peterz@...radead.org>
Cc:     LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC PATCH] x86/syscalls: allow tracing of __do_sys_[syscall]
 functions

On Sep 13, 2022, at 6:52 AM, Nadav Amit <nadav.amit@...il.com> wrote:

> From: Nadav Amit <namit@...are.com>
> 
> Tracing - through ftrace function tracer and kprobes - of certain common
> syscall functions is currently disabled. Setting kprobes on these
> functions is specifically useful for debugging of syscall failures.
> 
> Such tracing is disabled since __do_sys_[syscall] functions are declared
> as "inline".  "inline" in the kernel is actually defined as a macro that
> in addition to using the inline keyword also disables tracing (notrace).
> According to the comments in the code, tracing inline functions can
> wreck havoc, which is probably true in some cases.
> 
> In practice, however, this might be too extensive. The compiler regards
> the "inline" keyword only as a hint, which it is free to ignore. In
> fact, in my builds gcc ignores the "inline" hint for many
> __do_sys_[syscall] since some of these functions are quite big and
> called from multiple locations (for compat). As a result, these
> functions cannot be traced.
> 
> There are 3 possible solutions for enabling the tracing of
> __do_sys_[syscall]:
> 
> 1. Mark __do_sys_[syscall] as __always_inline instead of inline. This
>   would increase the executable size, which might not be desired.
> 
> 2. Remove the inline hint from __do_sys_[syscall]. Again, it might
>   affect the generated code, inducing function call overhead for some
>   syscalls.
> 
> 3. Remove "notrace" from the "inline" macro definition, and require
>   functions that cannot be traced to be marked explicitly as "notrace".
>   This might be the most correct solution, which would also enable tracing
>   of additional useful functions. But finding the functions that cannot
>   be traced is not easy without some automation.
> 
> 4. Avoid the use of "notrace" specifically for __do_sys_[syscall].

Steven, Peter, assistance would be helpful now that you are hopefully less
busy.

Using kprobes, I repeatedly get various crashes such as the one below that
happen in functions that are called from do_idle(). I am not sure that I get
exactly the reason (although it seems to be related to the RCU). The traces
for some reason show that when I patch a 2 byte JCC instruction and the RIP
always points to the second byte of the instructions, which I would not
expect to ever happen.

So basically, I have two questions for you:

1. What is the reason that inline functions are marked with notrace?

2. Is probing function that is called from do_idle() supposed to work, or
   should the kernel prevent it?

Thanks,
Nadav


[ 2381.637652] BUG: unable to handle page fault for address: ffffc90077cb6e4b
[ 2381.645568] #PF: supervisor read access in kernel mode
[ 2381.651367] #PF: error_code(0x0000) - not-present page
[ 2381.657155] PGD 100000067 P4D 100000067 PUD 0 
[ 2381.662154] Oops: 0000 [#1] PREEMPT SMP NOPTI
[ 2381.667040] CPU: 19 PID: 0 Comm: swapper/19 Not tainted 5.17.0-rc4 #6
[ 2381.674749] Hardware name: Cisco Systems Inc UCSC-C220-M5SX/UCSC-C220-M5SX, BIOS C220M5.4.0.1i.0.0522190226 05/22/2019
[ 2381.686733] RIP: 0010:poke_int3_handler+0x6d/0x140
[ 2381.692108] Code: 32 31 c0 48 81 c6 00 00 00 81 48 39 f1 74 39 f0 41 ff 08 c3 31 c0 c3 49 89 d3 49 89 c2 49 d1 ea 4c 89 d2 48 c1 e2 04 4c 01 da <48> 63 32 48 81 c6 00 00 00 81 48 39 f1 0f 82 b7 00 00 00 0f 87 98
[ 2381.713139] RSP: 0000:ffffc9000c6a7c90 EFLAGS: 00010086
[ 2381.718994] RAX: 000000000d4f7e0a RBX: 0000000000000000 RCX: ffffffff81226ffb
[ 2381.726990] RDX: ffffc90077cb6e4b RSI: ffffffff82400a99 RDI: ffffc9000c6a7cb8
[ 2381.734986] RBP: ffffc9000c6a7ca8 R08: ffffc9000d4f7d84 R09: ffffffff81226ffc
[ 2381.742979] R10: 0000000006a7bf05 R11: ffffc9000d4f7dfb R12: ffffc9000c6a7cb8
[ 2381.750974] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 2381.758966] FS:  0000000000000000(0000) GS:ffff88afe3740000(0000) knlGS:0000000000000000
[ 2381.768031] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2381.774469] CR2: ffffc90077cb6e4b CR3: 000000000680a001 CR4: 00000000007706e0
[ 2381.782462] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 2381.790455] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 2381.798447] PKRU: 55555554
[ 2381.801481] Call Trace:
[ 2381.804224]  <TASK>
[ 2381.806575]  ? exc_int3+0x10/0xd0
[ 2381.810292]  asm_exc_int3+0x31/0x40
[ 2381.814205] RIP: 0010:__hrtimer_next_event_base+0x6c/0xe0
[ 2381.820260] Code: 01 00 00 00 48 8b 41 28 49 39 c6 74 67 48 c1 e6 06 48 8b 50 18 4a 2b 54 26 78 4c 39 ea 7d 08 49 89 d5 4d 85 f6 74 40 85 db cc <21> 44 89 f8 89 d9 f3 48 0f bc c9 d3 e0 89 ce 89 c9 48 83 c1 01 f7
[ 2381.841292] RSP: 0000:ffffc9000c6a7d98 EFLAGS: 00000046
[ 2381.847148] RAX: ffff88afe3763960 RBX: 0000000000000000 RCX: ffff88afe3763200
[ 2381.855141] RDX: 0000022a7fa0e539 RSI: 0000000000000000 RDI: ffff88afe37631c0
[ 2381.863133] RBP: ffffc9000c6a7dc8 R08: 00000066a1725854 R09: 000002490d303b39
[ 2381.871117] R10: ffff88afe376fba4 R11: ffff88afe376fb84 R12: ffff88afe37631c0
[ 2381.879109] R13: 0000022a7fa0e539 R14: ffff88afe3763700 R15: 0000000000000001
[ 2381.887107]  ? __hrtimer_next_event_base+0x6c/0xe0
[ 2381.892478]  elfcorehdr_read+0x40/0x40
[ 2381.896681]  tick_nohz_get_sleep_length+0x9d/0xc0
[ 2381.901955]  menu_select+0x4bb/0x630
[ 2381.905965]  cpuidle_select+0x16/0x20
[ 2381.910069]  do_idle+0x1d2/0x270
[ 2381.913689]  cpu_startup_entry+0x20/0x30
[ 2381.918086]  start_secondary+0x118/0x150
[ 2381.922484]  secondary_startup_64_no_verify+0xc3/0xcb
[ 2381.928147]  </TASK>
[ 2381.931535] Modules linked in: zram
[ 2381.936365] CR2: ffffc90077cb6e4b
[ 2381.940998] ---[ end trace 0000000000000000 ]---

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ