Message-ID: <CAKv+Gu-08i6QEahyYhBtCBmpOYNuxirgdsgGvRw+Y0DX3+DVNQ@mail.gmail.com>
Date: Mon, 8 Oct 2018 19:30:23 +0200
From: Ard Biesheuvel <ard.biesheuvel@...aro.org>
To: Andy Lutomirski <luto@...capital.net>
Cc: Peter Zijlstra <peterz@...radead.org>,
Steven Rostedt <rostedt@...dmis.org>,
LKML <linux-kernel@...r.kernel.org>,
Linus Torvalds <torvalds@...ux-foundation.org>,
Ingo Molnar <mingo@...nel.org>,
Andrew Morton <akpm@...ux-foundation.org>,
Thomas Gleixner <tglx@...utronix.de>,
Masami Hiramatsu <mhiramat@...nel.org>,
Mathieu Desnoyers <mathieu.desnoyers@...icios.com>,
Matthew Helsley <mhelsley@...are.com>,
"Rafael J. Wysocki" <rafael.j.wysocki@...el.com>,
David Woodhouse <dwmw2@...radead.org>,
Paolo Bonzini <pbonzini@...hat.com>,
Josh Poimboeuf <jpoimboe@...hat.com>,
Jason Baron <jbaron@...mai.com>, Jiri Kosina <jkosina@...e.cz>,
Andrew Lutomirski <luto@...nel.org>
Subject: Re: [POC][RFC][PATCH 1/2] jump_function: Addition of new feature "jump_function"
On 8 October 2018 at 19:25, Andy Lutomirski <luto@...capital.net> wrote:
> On Mon, Oct 8, 2018 at 9:40 AM Peter Zijlstra <peterz@...radead.org> wrote:
>>
>> On Mon, Oct 08, 2018 at 09:29:56AM -0700, Andy Lutomirski wrote:
>> >
>> >
>> > > On Oct 8, 2018, at 8:57 AM, Peter Zijlstra <peterz@...radead.org> wrote:
>> > >
>> > > On Mon, Oct 08, 2018 at 01:33:14AM -0700, Andy Lutomirski wrote:
>> > >>> Can't we hijack the relocation records for these functions before they
>> > >>> get thrown out in the (final) link pass or something?
>> > >>
>> > >> I could be talking out my arse here, but I thought we could do this,
>> > >> too, then changed my mind. The relocation records give us the
>> > >> location of the call or jump operand, but they don’t give the address
>> > >> of the beginning of the instruction.
>> > >
>> > > But that's like 1 byte before the operand, right? We could even double check
>> > > this by reading back that byte and ensuring it is in fact 0xE8 (CALL).
>> > >
>> > > AFAICT there is only the _1_ CALL encoding, and that is the 5 byte: E8 <PLT32>,
>> > > so if we have the PLT32 location, we also have the instruction location. Or am
>> > > I missing something?
>> >
>> > There’s also JMP and Jcc, any of which can be used for tail calls, but
>> > those are also one byte. I suppose GCC is unlikely to emit a prefixed
>> > form of any of these. So maybe we really can assume they’re all one
>> > byte.
>>
>> Oh, I had not considered tail calls..
>>
>> > But there is a nasty potential special case: anything that takes the
>> > function’s address. This includes jump tables, computed gotos, and
>> > plain old function pointers. And I suspect that any of these could
>> > have one of the rather large number of CALL/JMP/Jcc bytes before the
>> > relocation by coincidence.
>>
>> We can have objtool verify the CALL/JMP/Jcc only condition. So if
>> someone tries to take the address of a patchable function, it will error
>> out.
>
> I think we should just ignore the sites that take the address and
> maybe issue a warning. After all, GCC can create them all by itself.
> We'll always have a plain wrapper function, and I think we should just
> not patch code that takes its address. So we do, roughly:
>
> void default_foo(void);
>
> GLOBAL(foo)
> jmp *current_foo(%rip)
> ENDPROC(foo)
>
> And code that does:
>
> foo();
>
> as a call, a tail call, a conditional tail call, etc, gets discovered
> by objtool + relocation processing or whatever and gets patched. (And
> foo() itself gets patched, too, as a special case. But we patch foo
> itself at some point during boot to turn it into a direct JMP. Doing
> it this way means that the whole mechanism works from very early
> boot.)
Does that mean that architectures could opt out of doing the whole
objtool + relocation processing thing, and instead take the hit of
going through the trampoline for all calls?
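
To make the question concrete, here is a minimal portable C sketch of what
such a trampoline-only fallback might look like. All names here
(default_foo, fast_foo, select_foo) are hypothetical and only illustrate the
idea; nothing below is taken from the patch itself.

```c
/* Hypothetical implementations that foo() can be switched between. */
static int default_foo(void) { return 1; }
static int fast_foo(void)    { return 2; }

/* The patchable target; starts out pointing at the default. */
static int (*current_foo)(void) = default_foo;

/*
 * The trampoline. Without call-site patching, every caller goes through
 * this function and pays one indirect branch, the C equivalent of
 * "jmp *current_foo(%rip)".
 */
static int foo(void)
{
	return current_foo();
}

/* Hypothetical update hook: only the pointer changes, never the callers. */
static void select_foo(int (*impl)(void))
{
	current_foo = impl;
}
```

In this model callers just use foo(), and a pointer taken with "ptr = foo"
keeps working after select_foo() because it refers to the trampoline rather
than to any particular implementation.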
> And anything awful like:
>
> switch(whatever) {
> case 0:
> foo();
> };
>
> that gets translated to a jump table and gets optimized such that it
> jumps straight to foo just gets left alone, since it still works.
> It's just a bit suboptimal. Similarly, code that does:
>
> void (*ptr)(void);
> ptr = foo;
>
> gets a bona fide pointer to foo(), and any calls through the pointer
> land on foo() and jump to the currently selected foo with only a single
> indirect branch / retpoline.
>
> Does this seem reasonable? Is there a reason we should make it more
> restrictive?
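
For reference, the opcode check discussed earlier in the thread (read back
the byte in front of the relocated operand and see whether it is a CALL, JMP
or Jcc) could be sketched roughly as below. This is only an illustration: the
function and enum names are made up, and it assumes x86-64 rel32 encodings,
where CALL is E8, JMP is E9, and Jcc with a 32-bit displacement is the
two-byte sequence 0F 80..8F. As pointed out above, a matching byte in front
of a relocation can be a coincidence (e.g. for a function pointer stored in
data), so a match is a hint and not a proof.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical classification of a relocation site. */
enum site_kind { SITE_UNKNOWN, SITE_CALL, SITE_JMP, SITE_JCC };

/*
 * text:      start of a buffer holding machine code
 * reloc_off: offset of the 4-byte operand the relocation points at
 *
 * Looks at the byte(s) immediately before the operand and reports which
 * rel32 branch opcode, if any, they match.
 */
static enum site_kind classify_site(const uint8_t *text, size_t reloc_off)
{
	/* Jcc rel32: 0F 8x <rel32>, so the opcode is two bytes long. */
	if (reloc_off >= 2 &&
	    text[reloc_off - 2] == 0x0F &&
	    (text[reloc_off - 1] & 0xF0) == 0x80)
		return SITE_JCC;

	if (reloc_off >= 1) {
		if (text[reloc_off - 1] == 0xE8)	/* CALL rel32 */
			return SITE_CALL;
		if (text[reloc_off - 1] == 0xE9)	/* JMP rel32 */
			return SITE_JMP;
	}

	/* Anything else is left alone (and perhaps warned about). */
	return SITE_UNKNOWN;
}
```

Sites that come back as SITE_UNKNOWN would simply keep going through the
foo() trampoline, which matches the "ignore and maybe warn" approach
proposed above.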