Date:   Thu, 10 Jan 2019 12:18:07 -0600
From:   Josh Poimboeuf <jpoimboe@...hat.com>
To:     Nadav Amit <namit@...are.com>
Cc:     X86 ML <x86@...nel.org>, LKML <linux-kernel@...r.kernel.org>,
        Ard Biesheuvel <ard.biesheuvel@...aro.org>,
        Andy Lutomirski <luto@...nel.org>,
        Steven Rostedt <rostedt@...dmis.org>,
        Peter Zijlstra <peterz@...radead.org>,
        Ingo Molnar <mingo@...nel.org>,
        Thomas Gleixner <tglx@...utronix.de>,
        Linus Torvalds <torvalds@...ux-foundation.org>,
        Masami Hiramatsu <mhiramat@...nel.org>,
        Jason Baron <jbaron@...mai.com>, Jiri Kosina <jkosina@...e.cz>,
        David Laight <David.Laight@...LAB.COM>,
        Borislav Petkov <bp@...en8.de>,
        Julia Cartwright <julia@...com>, Jessica Yu <jeyu@...nel.org>,
        "H. Peter Anvin" <hpa@...or.com>,
        Rasmus Villemoes <linux@...musvillemoes.dk>,
        Edward Cree <ecree@...arflare.com>,
        Daniel Bristot de Oliveira <bristot@...hat.com>
Subject: Re: [PATCH v3 0/6] Static calls

On Thu, Jan 10, 2019 at 05:32:08PM +0000, Nadav Amit wrote:
> > On Jan 10, 2019, at 8:44 AM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> > 
> > On Thu, Jan 10, 2019 at 01:21:00AM +0000, Nadav Amit wrote:
> >>> On Jan 9, 2019, at 2:59 PM, Josh Poimboeuf <jpoimboe@...hat.com> wrote:
> >>> 
> >>> With this version, I stopped trying to use text_poke_bp(), and instead
> >>> went with a different approach: if the call site destination doesn't
> >>> cross a cacheline boundary, just do an atomic write.  Otherwise, keep
> >>> using the trampoline indefinitely.
> >>> 
> >>> NOTE: At least experimentally, the call destination writes seem to be
> >>> atomic with respect to instruction fetching.  On Nehalem I can easily
> >>> trigger crashes when writing a call destination across cachelines while
> >>> reading the instruction on another CPU; but I get no such crashes when
> >>> respecting cacheline boundaries.
> >>> 
> >>> BUT, the SDM doesn't document this approach, so it would be great if any
> >>> CPU people can confirm that it's safe!
> >> 
> >> I (still) think that having a compiler plugin can make things much cleaner
> >> (as done in [1]). The callers would not need to be changed, and the key can
> >> be provided through an attribute.
> >> 
> >> Using a plugin should also make it possible to use Steven’s proposal for
> >> doing text_poke() safely: by changing 'func()' into 'asm ("call func")', as done
> >> by the plugin, you can be guaranteed that registers are clobbered. Then, you
> >> can store in the assembly block the return address in one of these
> >> registers.
> > 
> > I'm no GCC expert (why do I find myself saying this a lot lately?), but
> > this sounds to me like it could be tricky to get right.
> > 
> > I suppose you'd have to do it in an early pass, to allow GCC to clobber
> > the registers in a later pass.  So it would necessarily have side
> > effects, but I don't know what the risks are.
> 
> I’m not a GCC expert either, and writing this code did not fill me with
> joy. I’d be happy to have my code reviewed, but it does work. I don’t
> think an early pass is needed, as long as hardware registers have not
> been allocated yet.
> 
> > Would it work with more than 5 arguments, where args get passed on the
> > stack?
> 
> It does.
> 
> > 
> > At the very least, it would (at least partially) defeat the point of the
> > callee-saved paravirt ops.
> 
> Actually, I think you can even deal with callee-saved functions and remove
> all the (terrible) macros. You would need to tell the extension not to
> clobber the registers through a new attribute.

Ok, it does sound interesting then.  I assume you'll be sharing the
code?

> > What if we just used a plugin in a simpler fashion -- to do call site
> > alignment, if necessary, to ensure the instruction doesn't cross
> > cacheline boundaries.  This could be done in a later pass, with no side
> > effects other than code layout.  And it would allow us to avoid
> > breakpoints altogether -- again, assuming somebody can verify that
> > intra-cacheline call destination writes are atomic with respect to
> > instruction decoder reads.
> 
> The plugin should not be able to do so. Layout of the bytecode is done by
> the assembler, so I don’t think a plugin would help you with this one.

Actually I think we could use .bundle_align_mode for this purpose:

  https://sourceware.org/binutils/docs-2.31/as/Bundle-directives.html
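
For reference, the "doesn't cross a cacheline boundary" condition discussed
above can be sketched as a small standalone check. This is only an
illustration, not code from the patch set: the helper names are hypothetical,
and it assumes 64-byte cachelines and a 5-byte direct call (opcode 0xE8
followed by a 4-byte rel32 destination).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for illustration only (not taken from the patches). */
#define CACHELINE_SIZE   64u /* assumed cacheline size in bytes */
#define CALL_DEST_OFFSET 1u  /* rel32 follows the 1-byte 0xE8 opcode */
#define CALL_DEST_SIZE   4u  /* rel32 is 4 bytes wide */

/* True if the bytes [addr, addr + len) all lie in one cacheline. */
static bool within_one_cacheline(uintptr_t addr, size_t len)
{
	assert(len > 0 && len <= CACHELINE_SIZE);
	return (addr / CACHELINE_SIZE) == ((addr + len - 1) / CACHELINE_SIZE);
}

/*
 * Hypothetical helper: true if the 4-byte destination of the call
 * instruction at insn_addr does not straddle a cacheline boundary,
 * i.e. the case where a single aligned-enough write is being considered.
 */
static bool call_dest_is_patchable(uintptr_t insn_addr)
{
	return within_one_cacheline(insn_addr + CALL_DEST_OFFSET,
				    CALL_DEST_SIZE);
}
```

With .bundle_align_mode, the idea would be that the assembler keeps each
instruction inside one bundle, so a check like this would hold for every call
site as long as the bundle size does not exceed the cacheline size.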

-- 
Josh