linux-kernel - Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump


Date:   Mon, 8 Oct 2018 10:25:51 -0700
From:   Andy Lutomirski <luto@...capital.net>
To:     Peter Zijlstra <peterz@...radead.org>
Cc:     Steven Rostedt <rostedt@...dmis.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Ingo Molnar <mingo@...nel.org>,
        Andrew Morton <akpm@...ux-foundation.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
        mhelsley@...are.com,
        "Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
        David Woodhouse <dwmw2@...radead.org>,
        Paolo Bonzini <pbonzini@...hat.com>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Jason Baron <jbaron@...mai.com>, Jiri Kosina <jkosina@...e.cz>,
        Ard Biesheuvel <ard.biesheuvel@...aro.org>,
        Andrew Lutomirski <luto@...nel.org>
Subject: Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"

On Mon, Oct 8, 2018 at 9:40 AM Peter Zijlstra <peterz@...radead.org> wrote:
>
> On Mon, Oct 08, 2018 at 09:29:56AM -0700, Andy Lutomirski wrote:
> >
> >
> > > On Oct 8, 2018, at 8:57 AM, Peter Zijlstra <peterz@...radead.org> wrote:
> > >
> > > On Mon, Oct 08, 2018 at 01:33:14AM -0700, Andy Lutomirski wrote:
> > >>> Can't we hijack the relocation records for these functions before they
> > >>> get thrown out in the (final) link pass or something?
> > >>
> > >> I could be talking out my arse here, but I thought we could do this,
> > >> too, then changed my mind.  The relocation records give us the
> > >> location of the call or jump operand, but they don’t give the address
> > >> of the beginning of the instruction.
> > >
> > > But that's like 1 byte before the operand, right? We could even double check
> > > this by reading back that byte and ensuring it is in fact 0xE8 (CALL).
> > >
> > > AFAICT there is only the _1_ CALL encoding, and that is the 5 byte: E8 <PLT32>,
> > > so if we have the PLT32 location, we also have the instruction location. Or am
> > > I missing something?
> >
> > There’s also JMP and Jcc, any of which can be used for tail calls, but
> > those are also one byte. I suppose GCC is unlikely to emit a prefixed
> > form of any of these. So maybe we really can assume they’re all one
> > byte.
>
> Oh, I had not considered tail calls..
>
> > But there is a nasty potential special case: anything that takes the
> > function’s address. This includes jump tables, computed gotos, and
> > plain old function pointers. And I suspect that any of these could
> > have one of the rather large number of CALL/JMP/Jcc bytes before the
> > relocation by coincidence.
>
> We can have objtool verify the CALL/JMP/Jcc only condition. So if
> someone tries to take the address of a patchable function, it will error
> out.

I think we should just ignore the sites that take the address and
maybe issue a warning.  After all, GCC can create them all by itself.
We'll always have a plain wrapper function, and I think we should just
not patch code that takes its address.  So we do, roughly:

void default_foo(void);

GLOBAL(foo)
  jmp *current_foo(%rip)
ENDPROC(foo)
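
For illustration, the wrapper above can be modelled in plain user-space
C.  This is only a sketch: current_foo is an ordinary function pointer
standing in for the slot the JMP reads, and fast_foo(), foo_update()
and the two counters are hypothetical names that do not appear in the
patch.

```c
/* User-space model of the wrapper; not kernel code. */
static int calls_default, calls_fast;

static void default_foo(void) { calls_default++; }
static void fast_foo(void)    { calls_fast++; }   /* hypothetical alternative target */

/* Stands in for the slot read by "jmp *current_foo(%rip)". */
static void (*current_foo)(void) = default_foo;

/* The plain wrapper: always valid, costs one indirect branch. */
void foo(void)
{
	current_foo();
}

/*
 * Hypothetical update helper.  The real mechanism would also rewrite
 * the direct call sites; here only the slot changes.
 */
void foo_update(void (*target)(void))
{
	current_foo = target;
}
```

A pointer taken before foo_update() still reaches the new target,
because it points at the wrapper and not at whatever the slot held at
the time.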

And code that does:

foo();

as a call, a tail call, a conditional tail call, etc, gets discovered
by objtool + relocation processing or whatever and gets patched.  (And
foo() itself gets patched, too, as a special case: at some point during
boot we turn it into a direct JMP.  Doing
it this way means that the whole mechanism works from very early
boot.)  And anything awful like:

switch(whatever) {
case 0:
  foo();
}

that gets translated to a jump table and gets optimized such that it
jumps straight to foo just gets left alone, since it still works.
It's just a bit suboptimal.  Similarly, code that does:

void (*ptr)(void);
ptr = foo;

gets a bona fide pointer to foo(), and any calls through the pointer
land on foo() and jump to the current selected foo with only a single
indirect branch / retpoline.
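
To make the "patch what we recognize, leave the rest alone" policy
concrete, here is a rough sketch of the byte check and the operand
rewrite.  It is not objtool code; classify_site(), retarget_site() and
enum site_kind are made-up names, and the buffer plus its load address
are assumed inputs.  Note that the rel32 form of Jcc is the two-byte
opcode 0F 8x, so that case looks two bytes back.  A byte check like
this cannot by itself rule out a coincidental match in front of an
address-taking relocation.

```c
#include <stddef.h>
#include <stdint.h>

enum site_kind { SITE_NONE, SITE_CALL, SITE_JMP, SITE_JCC };

/* rel_off is the offset of the 4-byte operand the relocation points at. */
static enum site_kind classify_site(const uint8_t *buf, size_t rel_off)
{
	if (rel_off >= 2 && buf[rel_off - 2] == 0x0f &&
	    (buf[rel_off - 1] & 0xf0) == 0x80)
		return SITE_JCC;	/* 0F 8x rel32 */
	if (rel_off >= 1 && buf[rel_off - 1] == 0xe8)
		return SITE_CALL;	/* E8 rel32 */
	if (rel_off >= 1 && buf[rel_off - 1] == 0xe9)
		return SITE_JMP;	/* E9 rel32 */
	return SITE_NONE;		/* e.g. address taken: leave alone */
}

/*
 * Rewrite the operand so the branch lands on target.  buf_addr is the
 * address buf[0] would have at run time.  Returns 1 if patched, 0 if
 * the site was skipped.
 */
static int retarget_site(uint8_t *buf, size_t rel_off,
			 uint64_t buf_addr, uint64_t target)
{
	int64_t disp;
	uint32_t rel;
	int i;

	if (classify_site(buf, rel_off) == SITE_NONE)
		return 0;

	/* rel32 is relative to the end of the instruction. */
	disp = (int64_t)(target - (buf_addr + rel_off + 4));
	if (disp < INT32_MIN || disp > INT32_MAX)
		return 0;

	rel = (uint32_t)(int32_t)disp;
	for (i = 0; i < 4; i++)		/* store little-endian */
		buf[rel_off + i] = (uint8_t)(rel >> (8 * i));
	return 1;
}
```

Sites that come back as SITE_NONE keep going through the wrapper, which
is the "still works, just a bit slower" case described above.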

Does this seem reasonable?  Is there a reason we should make it more
restrictive?