lists.openwall.net   lists  /  announce  owl-users  owl-dev  john-users  john-dev  passwdqc-users  yescrypt  popa3d-users  /  oss-security  kernel-hardening  musl  sabotage  tlsify  passwords  /  crypt-dev  xvendor  /  Bugtraq  Full-Disclosure  linux-kernel  linux-netdev  linux-ext4  linux-hardening  linux-cve-announce  PHC 
Open Source and information security mailing list archives
 
Hash Suite: Windows password security audit tool. GUI, reports in PDF.
[<prev] [next>] [<thread-prev] [thread-next>] [day] [month] [year] [list]
Date:   Thu, 29 Nov 2018 11:27:00 -0800
From:   Andy Lutomirski <luto@...nel.org>
To:     Linus Torvalds <torvalds@...ux-foundation.org>
Cc:     Steven Rostedt <rostedt@...dmis.org>,
        Josh Poimboeuf <jpoimboe@...hat.com>,
        Peter Zijlstra <peterz@...radead.org>,
        Andrew Lutomirski <luto@...nel.org>, X86 ML <x86@...nel.org>,
        LKML <linux-kernel@...r.kernel.org>,
        Ard Biesheuvel <ard.biesheuvel@...aro.org>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Jason Baron <jbaron@...mai.com>, Jiri Kosina <jkosina@...e.cz>,
        David Laight <David.Laight@...lab.com>,
        Borislav Petkov <bp@...en8.de>, julia@...com, jeyu@...nel.org,
        "H. Peter Anvin" <hpa@...or.com>
Subject: Re: [PATCH v2 4/4] x86/static_call: Add inline static call
 implementation for x86-64

On Thu, Nov 29, 2018 at 11:08 AM Linus Torvalds
<torvalds@...ux-foundation.org> wrote:
>
> On Thu, Nov 29, 2018 at 10:58 AM Linus Torvalds
> <torvalds@...ux-foundation.org> wrote:
> >
> > In contrast, if the call was wrapped in an inline asm, we'd *know* the
> > compiler couldn't turn a "call wrapper(%rip)" into anything else.
>
> Actually, I think I have a better model - if the caller is done with inline asm.
>
> What you can do then is basically add a single-byte prefix to the
> "call" instruction that does nothing (say, cs override), and then
> replace *that* with a 'int3' instruction.
>
> Boom. Done.
>
> Now, the "int3" handler can just update the instruction in-place, but
> leave the "int3" in place, and then return to the next instruction
> byte (which is just the normal branch instruction without the prefix
> byte).
>
> The cross-CPU case continues to work, because the 'int3' remains in
> place until after the IPI.

Hmm, cute.  But then the calls are in inline asm, which results in
giant turds like we have for the pvop vcalls.  And, if they start
being used more generally, we potentially have ABI issues where the
calling convention isn't quite what the asm expects, and we explode.

I propose a different solution:

As in this patch set, we have a direct and an indirect version.  The
indirect version remains exactly the same as in this patch set.  The
direct version just only does the patching when all seems well: the
call instruction needs to be 0xe8, and we only do it when the thing
doesn't cross a cache line.  Does that work?  In the rare case where
the compiler generates something other than 0xe8 or crosses a cache
line, then the thing just remains as a call to the out of line jmp
trampoline.  Does that seem reasonable?  It's a very minor change to
the patch set.

Alternatively, we could actually emulate call instructions like this:

void __noreturn jump_to_kernel_pt_regs(struct pt_regs *regs, ...)
{
  struct pt_regs ptregs_copy = *regs;
  barrier();
  *(unsigned long *)(regs->sp - 8) = whatever;  /* may clobber old
regs, but so what? */
  asm volatile ("jmp return_to_alternate_ptregs");
}

where return_to_alternate_ptregs points rsp to the ptregs and goes
through the normal return path.  It's ugly, but we could have a test
case for it, and it should work fine.

Powered by blists - more mailing lists

Powered by Openwall GNU/*/Linux Powered by OpenVZ