linux-kernel - Re: [x86, build] 6dafca9780: WARNING:at_arch/x86/kernel/ftrace.c:#ftrace_verify


Message-ID: <CABCJKudQme=bcNJtNCORUpBnVA_Pkr2Zk_3Nu2EsfnuoE7LJ3g@mail.gmail.com>
Date:   Mon, 1 Mar 2021 16:03:51 -0800
From:   Sami Tolvanen <samitolvanen@...gle.com>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     kernel test robot <oliver.sang@...el.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Kees Cook <keescook@...omium.org>,
        LKML <linux-kernel@...r.kernel.org>, lkp@...ts.01.org,
        kernel test robot <lkp@...el.com>
Subject: Re: [x86, build] 6dafca9780: WARNING:at_arch/x86/kernel/ftrace.c:#ftrace_verify_code

On Mon, Mar 1, 2021 at 3:45 PM Steven Rostedt <rostedt@...dmis.org> wrote:
>
> On Mon, 1 Mar 2021 14:14:51 -0800
> Sami Tolvanen <samitolvanen@...gle.com> wrote:
>
> > Basically, the problem is that ftrace_replace_code() expects to find
> > ideal_nops[NOP_ATOMIC5] here, which in this case is 66:66:66:66:90,
> > while objtool has replaced the __fentry__ call with 0f:1f:44:00:00.
> >
> > As ideal_nops changes depending on kernel config and hardware, when
> > CC_USING_NOP_MCOUNT is defined we could either change
> > ftrace_nop_replace() to always use P6_NOP5, or skip
> > ftrace_verify_code() in ftrace_replace_code() for
> > FTRACE_UPDATE_MAKE_CALL.
>
> So I hacked up the code to get -mnop-record to work on x86, and checked the
> vmlinux and it gives me:
>
> ffffffff81bc6120 <schedule>:
> ffffffff81bc6120:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> ffffffff81bc6125:       55                      push   %rbp
> ffffffff81bc6126:       65 48 8b 2c 25 c0 7d 01 00      mov    %gs:0x17dc0,%rbp
>                         ffffffff81bc612b: R_X86_64_32S  current_task
> ffffffff81bc612f:       53                      push   %rbx
> ffffffff81bc6130:       48 8b 45 18             mov    0x18(%rbp),%rax
>
>
> Which is the 0f:1f:44:00:00, and it works fine for me.
>
> Now, that could be because the ideal_nops[NOP_ATOMIC5] is the same, which
> would explain this.
>
> No, we should *not* change ftrace_nop_replace() to always use any P6_NOP5,
> as there was a reason we did this. Because not all nops are the same, and
> this gets called for *every* function that is traced.
>
> No, we should not skip ftrace_verify_code() *ever*. (/me was just
> referencing on twitter the scenario where ftrace bricked e1000e cards).
>
> This is probably why I never was much for the compiler conversion into nops,
> because it may choose the wrong one for the architecture.

Sure, makes sense. Should we just skip the conversion in objtool then
and let the kernel deal with it?

> What we could do, is if the nop chosen by the compiler is not the ideal
> nop, to go back and modify all the nops added by the compiler to the ideal
> one, which would keep it using the most efficient one.
>
> Or, add something like this:
>

[...]

>                 ret = ftrace_verify_code(rec->ip, old);
> +
> +               if (__is_defined(CC_USING_NOP_MCOUNT) && ret && old_nop) {
> +                       /* Compiler could have put in P6_NOP5 */
> +                       old = P6_NOP5;
> +                       ret = ftrace_verify_code(rec->ip, old);
> +               }
> +

Wouldn't that still hit WARN(1) in the initial ftrace_verify_code()
call if ideal_nops doesn't match?
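
To make the concern concrete, here is a small userspace model, not kernel
code. It assumes, as the question above does, that the verify step warns on
any byte mismatch. Every name in it (model_verify, verify_with_retry,
verify_either, warn_count) is hypothetical and exists only for illustration;
the two byte sequences are the ones quoted earlier in this thread.

```c
#include <string.h>

#define INSN_SIZE 5

/* The two 5-byte NOP encodings quoted in this thread. */
static const unsigned char nop_66_prefixed[INSN_SIZE] = { 0x66, 0x66, 0x66, 0x66, 0x90 };
static const unsigned char nop_p6[INSN_SIZE]          = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Hypothetical stand-in for WARN(1): counts how often a warning would fire. */
static int warn_count;

/* Model of a verify step that warns on any mismatch (assumption). */
static int model_verify(const unsigned char *text, const unsigned char *expected)
{
	if (memcmp(text, expected, INSN_SIZE) != 0) {
		warn_count++;
		return -1;
	}
	return 0;
}

/* Model of the retry idea: verify against the ideal nop, then retry with P6. */
static int verify_with_retry(const unsigned char *text, const unsigned char *ideal)
{
	int ret = model_verify(text, ideal);

	if (ret)
		ret = model_verify(text, nop_p6);
	return ret;
}

/*
 * Hypothetical alternative: compare against both candidates quietly and
 * warn only when neither matches.
 */
static int verify_either(const unsigned char *text, const unsigned char *ideal)
{
	if (memcmp(text, ideal, INSN_SIZE) == 0 ||
	    memcmp(text, nop_p6, INSN_SIZE) == 0)
		return 0;

	warn_count++;
	return -1;
}
```

In this model, when the text holds 0f:1f:44:00:00 and the ideal nop is
66:66:66:66:90, verify_with_retry() succeeds but has already bumped
warn_count once, while verify_either() succeeds without warning. Whether
the real code behaves the same way depends on what ftrace_verify_code()
does on a mismatch.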

Sami