linux-kernel - Re: [PATCH 3/4] x86/ftrace: make ftrace_int3

lists.openwall.net		lists / announce owl-users owl-dev john-users john-dev passwdqc-users yescrypt popa3d-users / oss-security kernel-hardening musl sabotage tlsify passwords / crypt-dev xvendor / Bugtraq Full-Disclosure linux-kernel linux-netdev linux-ext4 linux-hardening linux-cve-announce PHC
Open Source and information security mailing list archives

Hash Suite: Windows password security audit tool. GUI, reports in PDF.

[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]

Message-ID: <CAHk-=wjphmrQXMfbw9j-tTzDvJ+Uc+asMHdFa=1_1xZoYVUC=g@mail.gmail.com>
Date:   Mon, 29 Apr 2019 11:22:51 -0700
From:   Linus Torvalds <torvalds@...ux-foundation.org>
To:     Steven Rostedt <rostedt@...dmis.org>
Cc:     Peter Zijlstra <peterz@...radead.org>,
        Nicolai Stange <nstange@...e.de>,
        Thomas Gleixner <tglx@...utronix.de>,
        Ingo Molnar <mingo@...hat.com>, Borislav Petkov <bp@...en8.de>,
        "H. Peter Anvin" <hpa@...or.com>,
        "the arch/x86 maintainers" <x86@...nel.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Jiri Kosina <jikos@...nel.org>,
        Miroslav Benes <mbenes@...e.cz>,
        Petr Mladek <pmladek@...e.com>,
        Joe Lawrence <joe.lawrence@...hat.com>,
        Shuah Khan <shuah@...nel.org>,
        Konrad Rzeszutek Wilk <konrad.wilk@...cle.com>,
        Tim Chen <tim.c.chen@...ux.intel.com>,
        Sebastian Andrzej Siewior <bigeasy@...utronix.de>,
        Mimi Zohar <zohar@...ux.ibm.com>,
        Juergen Gross <jgross@...e.com>,
        Nick Desaulniers <ndesaulniers@...gle.com>,
        Nayna Jain <nayna@...ux.ibm.com>,
        Masahiro Yamada <yamada.masahiro@...ionext.com>,
        Andy Lutomirski <luto@...nel.org>,
        Joerg Roedel <jroedel@...e.de>,
        Linux List Kernel Mailing <linux-kernel@...r.kernel.org>,
        live-patching@...r.kernel.org,
        "open list:KERNEL SELFTEST FRAMEWORK" 
        <linux-kselftest@...r.kernel.org>
Subject: Re: [PATCH 3/4] x86/ftrace: make ftrace_int3_handler() not to skip
 fops invocation

On Mon, Apr 29, 2019 at 11:06 AM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
>
> It does *not* emulate the "call" in the BP handler itself, instead if
> replace the %ip (the same way all the other BP handlers replace the
> %ip) with a code sequence that just does
>
>         push %gs:bp_call_return
>         jmp *%gs:bp_call_target
>
> after having filled in those per-cpu things.

Note that if you read the patch, you'll see that my explanation
glossed over the "what if an interrupt happens" part. Which is handled
by having two handlers, one for "interrupts were already disabled" and
one for "interrupts were enabled, so I disabled them before entering
the handler".

The second handler does the same push/jmp sequence, but has a "sti"
before the jmp. Because of the one-instruction sti shadow, interrupts
won't actually be enabled until after the jmp instruction has
completed, and thus the "push/jmp" is atomic wrt regular interrupts.

It's not safe wrt NMI, of course, but since NMI won't be rescheduling,
and since any SMP IPI won't be punching through that sequence anyway,
it's still atomic wrt _another_ text_poke() attempt coming in and
re-using the bp_call_return/tyarget slots.

So yeah, it's not "one-liner" trivial, but it's not like it's
complicated either, and it actually matches the existing "call this
code to emulate the replaced instruction". So I'd much rather have a
couple of tens of lines of code here that still acts pretty much
exactly like all the other rewriting does, rather than play subtle
games with the entry stack frame.

Finally: there might be other situations where you want to have this
kind of "pseudo-atomic" replacement sequence, so I think while it's a
hack specific to emulating a "call" instruction, I don't think it is
conceptually limited to just that case.

                 Linus