Message-Id: <D0F82355-EB17-46A3-82AA-CC0B26344A08@gmail.com>
Date:   Tue, 20 Sep 2022 09:48:10 -0700
From:   Nadav Amit <nadav.amit@...il.com>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     "Steven Rostedt (Google)" <rostedt@...dmis.org>,
        LKML <linux-kernel@...r.kernel.org>, X86 ML <x86@...nel.org>,
        Dave Hansen <dave.hansen@...ux.intel.com>,
        Borislav Petkov <bp@...en8.de>, Ingo Molnar <mingo@...hat.com>,
        Thomas Gleixner <tglx@...utronix.de>
Subject: Re: [RFC PATCH] x86/syscalls: allow tracing of __do_sys_[syscall]
 functions

On Sep 20, 2022, at 4:02 AM, Peter Zijlstra <peterz@...radead.org> wrote:

> On Mon, Sep 19, 2022 at 07:35:42PM -0700, Nadav Amit wrote:
> 
>> 1. What is the reason that inline functions are marked with notrace?
> 
> IIRC the concern is that a notrace function using an inline function;
> GCC deciding to not inline and then still hitting tracing.
> 
> For noinstr we've mandated __always_inline to avoid this problem. The
> direct advantage is that those inlined into instrumented code get, well,
> instrumented.

I fully understand the __always_inline case. What I do not understand is
plain inline, which is only a hint. Anyhow, I just thought that you would
probably know, but I’ll do the digging and compare the tables with and
without inline implying notrace.

> 
>> 2. Is probing a function that is called from do_idle() supposed to work,
>>   or should the kernel prevent it?
> 
> Should work for some :-) Specifically it doesn't work for those that
> disable RCU, and that's (largely) being fixed here:
> 
>  https://lore.kernel.org/all/20220919095939.761690562@infradead.org/T/#u
> 
> Although looking at it just now, I think I missed a spot.. lemme go fix
> ;-)
> 

Thank you. I’ll give it a spin as soon as I finish some other work (which
may take days).


> I'm failing to find this callchain; where is
> tick_nohz_get_sleep_length() calling to elfcorehdr_read() ?!?

Very strange. According to DWARF and disassembly, the call in the code is
actually to hrtimer_next_event_without() and nothing more, and
elfcorehdr_read+0x40/0x40 is actually after the ret.
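
As a rough illustration of the point above (not code from this thread or
from the kernel), the sketch below parses a backtrace entry of the form
symbol+0xOFFSET/0xSIZE and flags entries whose offset is not smaller than
the symbol size, as in elfcorehdr_read+0x40/0x40. The names struct frame,
parse_frame() and frame_is_past_end() are hypothetical helpers made up for
this example, and the reading "offset >= size means the address lies past
the last byte of the symbol" is an assumption about how the entry was
symbolized.

```c
#include <stdbool.h>
#include <stdio.h>

/* Parsed form of a backtrace entry such as "elfcorehdr_read+0x40/0x40".
 * Hypothetical helper type, for illustration only. */
struct frame {
    char symbol[128];
    unsigned long offset;  /* bytes past the start of the symbol */
    unsigned long size;    /* size of the symbol in bytes */
};

/* Returns true if text has the form "symbol+0xOFF/0xSIZE". */
static bool parse_frame(const char *text, struct frame *out)
{
    return sscanf(text, " %127[^+]+%lx/%lx",
                  out->symbol, &out->offset, &out->size) == 3;
}

/* An offset equal to (or larger than) the size means the address is not
 * inside the named symbol's body; it sits just past its end. */
static bool frame_is_past_end(const struct frame *f)
{
    return f->offset >= f->size;
}
```

Fed the entries from the trace quoted below, elfcorehdr_read+0x40/0x40
would be flagged while tick_nohz_get_sleep_length+0x9d/0xc0 would not,
which matches the observation that the first entry does not point at a
real call site.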

The strangest part is that I actually collected additional similar crashes,
and I only now notice that all of them have elfcorehdr_read() in the trace
(which makes no sense). Good catch!

I’ll move to a newer kernel, apply your patches and dig into it too.

Thanks again,
Nadav



>> [ 2381.892478]  elfcorehdr_read+0x40/0x40
>> [ 2381.896681]  tick_nohz_get_sleep_length+0x9d/0xc0
>> [ 2381.901955]  menu_select+0x4bb/0x630
>> [ 2381.905965]  cpuidle_select+0x16/0x20
>> [ 2381.910069]  do_idle+0x1d2/0x270
>> [ 2381.913689]  cpu_startup_entry+0x20/0x30
>> [ 2381.918086]  start_secondary+0x118/0x150
>> [ 2381.922484]  secondary_startup_64_no_verify+0xc3/0xcb
>> [ 2381.928147]  </TASK>
>> [ 2381.931535] Modules linked in: zram
>> [ 2381.936365] CR2: ffffc90077cb6e4b
>> [ 2381.940998] ---[ end trace 0000000000000000 ]---